Moving image combining apparatus combining computer graphic image and at least one video sequence composed of a plurality of video frames

ABSTRACT

A coordinate light source calculating unit 103 calculates 3D coordinates for points forming one or more objects and performs clipping. A rendering unit 104 performs rendering and outputs the formed CG image to a frame buffer 108. The coordinate light source calculating unit 103 also calculates 3D coordinates for points forming a video display surface. A perspective transform unit 105 calculates 2D coordinates for each point forming the video display surface. An image decoder 106 decodes a video frame, and an image transform unit 107 transforms this image and outputs it to the frame buffer 108, enabling the video frame to be pasted onto the CG image.

TECHNICAL FIELD

The present invention relates to a moving image combining apparatus combining computer graphics with video images.

RELATED ART

The following processing is conventionally performed when combining and displaying computer graphics and video images in a virtual space displayed by a computer using three-dimensional (hereafter 3D) graphics. A computer graphics (hereafter CG) image is generated by performing graphics-generating calculations using coordinate values showing locations and outlines for objects in a virtual 3D space. A video frame is extracted from a video sequence and pasted onto the generated CG image using a method known as texture mapping. The resulting image is then displayed. High-speed repetition of this sequence of generating a CG image, extracting a video frame, pasting the video frame onto the CG image, and displaying the resulting image enables CG images on which video frames have been pasted to be displayed sequentially, giving the appearance of a moving image.
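This conventional lock-step cycle can be sketched as follows (a minimal Python sketch; the functions generate_cg_image, fetch_video_frame, texture_map and display are hypothetical placeholders, not part of the patent):

    FRAME_PERIOD = 1 / 30  # one video frame at 30 frames per second

    def conventional_combining_loop(objects, video, screen):
        """Related-art approach: CG generation, frame extraction, texture
        mapping and display run as one sequential cycle, so the overall
        rate is limited by the slowest step (usually CG generation)."""
        while video.has_frames():
            cg_image = generate_cg_image(objects)   # may take much longer than FRAME_PERIOD
            frame = fetch_video_frame(video)        # frames decoded meanwhile are lost
            combined = texture_map(cg_image, frame)
            display(screen, combined)

Because every video frame must wait for a full CG generation pass, the display rate of the combined image is coupled to the CG rate, which motivates the problem described next.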

However, the respective display rates for computer graphics and video images prior to combining are not necessarily identical. In a video sequence, a fixed number of frames can be displayed during a fixed time (this is hereafter referred to as the display rate). One standard for the video image display rate is 30 frames per second. In contrast, for computer graphics, the calculation time required to generate a CG image from coordinate values for object locations and outlines varies according to the number of objects to be displayed. As a result, achieving a uniform display rate is normally difficult.

Suppose the video image and computer graphics display rates are respectively 30 and 10 frames per second, and moving images are combined at the computer graphics display rate. This means that, of the 30 video frames that can potentially be displayed in one second, only the 10 frames coinciding with display times of CG images can be displayed. Consequently, the remaining 20 frames cannot be displayed, so that the movement of the video sequence is jerky.

If moving images are combined at the video image display rate, however, the calculation required to generate a CG image cannot be completed in the interval between the display of consecutive video frames, meaning that it may not be possible to generate a CG image on every occasion.

DISCLOSURE OF THE INVENTION

In order to overcome the above problems, an object of the present invention is to provide a moving image combining apparatus combining computer graphics and video images at their respective display rates, a moving image combining method, and a recording medium recording a program for combining moving images.

An invention achieving the above object is a moving image combining apparatus combining computer graphics images (hereafter referred to as CG images) and at least one video sequence composed of a plurality of video frames, the moving image combining apparatus including the following: an information storage unit storing object information showing an outline and location for at least one object in three-dimensional (3D) space; a video obtaining unit obtaining from an external source at least one video sequence composed of a plurality of video frames generated at a fixed video display rate; an image storage unit; a receiving unit for receiving position information showing a position of a moving viewpoint; a graphics generating unit for generating CG images one at a time at a graphics display rate and, on completing the generation of a CG image, writing the CG image into the image storage unit, the CG image obtained by projecting each object whose outline and location is shown by the object information onto a projection surface, as seen from a current position of the moving viewpoint shown by the position information; and a video frame generating unit for fetching at least one video frame from the at least one video sequence at the video display rate and writing the fetched at least one video frame over a CG image, the CG image being stored in the image storage unit immediately prior to the time that the at least one video frame was fetched.

This construction enables generating of CG images and decoding of video frames to be performed in parallel using separate processes, and the generated CG images and video frames to be combined in the image storage unit. As a result, computer graphics and video images can be combined at their respective display rates.
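As a rough illustration of this parallelism, the following sketch runs the graphics generating unit and the video frame generating unit as separate threads sharing one image storage buffer. All names, rates and placeholder stages here are illustrative assumptions, not the patented implementation:

    import threading, time

    # Hypothetical stand-ins for the real units; each returns a simple
    # placeholder "image" so the control flow can be run as-is.
    def render_cg_image(tick):          # graphics generating work (variable time)
        time.sleep(0.1)                 # e.g. roughly 10 CG images per second
        return f"CG-{tick}"

    def decode_next_frame(tick):        # video decoding work (fixed rate)
        return f"frame-{tick}"

    image_storage = {"image": None}     # shared image storage unit
    lock = threading.Lock()

    def graphics_loop():
        for tick in range(10):
            cg = render_cg_image(tick)  # finish a whole CG image first...
            with lock:                  # ...then write it in one step
                image_storage["image"] = cg

    def video_loop():
        for tick in range(30):
            frame = decode_next_frame(tick)
            with lock:                  # paste the frame over the newest CG image
                image_storage["image"] = (image_storage["image"], frame)
            time.sleep(1 / 30)          # fixed video display rate

    t1 = threading.Thread(target=graphics_loop)
    t2 = threading.Thread(target=video_loop)
    t1.start(); t2.start(); t1.join(); t2.join()

The key property is that the video loop never waits for CG generation: it simply overwrites whichever CG image was completed most recently.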

The graphics generating unit may further perform rendering on each generated CG image, and write the rendered CG images into the image storage unit.

This construction enables rendering to be performed on a CG image, so that realistic graphics can be obtained.

Here, the following construction may also be used. Each object includes at least one video display area. The moving image combining apparatus combines, on at least one video screen located on the projection surface, at least one video sequence and a CG image, each video screen corresponding to a video display area. The object information includes information showing an outline and location for each video display area. The graphics generating unit further calculates screen information showing an outline and location for each video screen, each video screen obtained by projecting a video display area shown by an outline and location in the object information onto the projection surface. The video frame generating unit overwrites fetched video frames at each location shown by the screen information, so that each fetched video frame fits an outline shown in the screen information.

This construction enables a video sequence to be combined on a video screen of an object.

The video frame generating unit may also be constructed so that it generates transformed video frames by transforming the fetched video frames to fit an outline shown in the screen information, and overwrites the transformed video frames into the image storage unit.

This construction enables a video frame to be transformed to fit the video screen of the object, enabling the video sequence to be combined more realistically.

The following construction may also be used. Each object has a plurality of video display areas. The video obtaining unit obtains a plurality of video sequences from an external source. The moving image combining apparatus combines, on each of a plurality of video screens on a projection surface, one of the video sequences with a CG image, each video screen corresponding to one of the plurality of video display areas. The object information includes information showing outlines and locations for a plurality of video display areas. The graphics generating unit calculates screen information for each piece of information showing the outline and location for one of the plurality of video display areas. The video frame generating unit fetches video frames from each of the plurality of video sequences, and overwrites fetched video frames from the different video sequences at the different locations shown by the plurality of pieces of screen information, so that the fetched video frames fit the outlines shown in the screen information.

This construction enables video sequences to be combined on each of a plurality of video screens, when an object has a plurality of video display areas.

The video frame generating unit may also include the following: a priority ranking determining unit for determining a priority ranking for each video screen based on the plurality of pieces of calculated screen information; a video decoding unit for obtaining video frames from each of the plurality of video sequences, based on the determined priority ranking; a masking location calculating unit for calculating locations to be masked on each video screen, based on the plurality of pieces of calculated screen information and the priority ranking determined for each video screen; and a masking unit for masking the transformed video frames at the calculated locations. Here, the video frame generating unit overwrites the transformed video frames which have been masked into the image storage unit.

This construction enables priority rankings to be determined according to video screens of objects, video frames to be obtained from video sequences based on the priority rankings, and masking to be performed on each video screen, so that video sequences can be combined more realistically.

The priority ranking determining unit may determine priority rankings using the plurality of pieces of calculated screen information, with video screens nearer to the viewpoint having a higher priority ranking.

This construction enables video screens nearer to a viewpoint to be given a higher priority ranking, so that video sequences can be combined more realistically.

The priority ranking determining unit may determine priority rankings using the plurality of pieces of calculated screen information, with video screens calculated as having a larger surface area having a higher priority ranking.

This construction enables video screens with a larger area to be given a higher priority ranking, so that a higher quality picture can be obtained.
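A minimal sketch of these two ranking policies follows, assuming each piece of screen information carries a representative depth value and a projected 2D outline; the ScreenInfo record and its fields are hypothetical, introduced only for illustration:

    from dataclasses import dataclass

    @dataclass
    class ScreenInfo:            # hypothetical per-screen record
        screen_id: int
        depth: float             # distance of the screen from the plane H
        outline: list            # 2D outline vertices [(x, y), ...]

    def polygon_area(outline):
        """Shoelace formula for the projected surface area."""
        area = 0.0
        for (x0, y0), (x1, y1) in zip(outline, outline[1:] + outline[:1]):
            area += x0 * y1 - x1 * y0
        return abs(area) / 2

    def rank_by_distance(screens):
        """Nearer screens (smaller depth) get higher priority."""
        return sorted(screens, key=lambda s: s.depth)

    def rank_by_area(screens):
        """Larger projected screens get higher priority."""
        return sorted(screens, key=lambda s: polygon_area(s.outline), reverse=True)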

The video decoding unit may obtain all of the video frames from a video sequence with the highest priority ranking, and omit more video frames from video sequences with lower priority rankings.

This construction enables a greater number of frames to be skipped at lower priority rankings, so that the picture quality of decoded video frames can be adjusted according to the priority ranking.
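One plausible reading of this policy is sketched below: rank 0 decodes every frame, and each lower rank decodes progressively fewer. The rank-to-step mapping is an assumption; the patent only requires that lower ranks omit more frames:

    def frames_to_decode(total_frames, rank):
        """Return indices of the frames actually decoded for a sequence
        with the given priority rank (0 = highest). Rank 0 keeps every
        frame; each lower rank decodes only every (rank + 1)-th frame."""
        step = rank + 1            # assumed skipping schedule
        return list(range(0, total_frames, step))

    # Example over one second of 30-frame video:
    print(frames_to_decode(30, 0))  # all 30 frames decoded
    print(frames_to_decode(30, 2))  # every third frame -> 10 frames decoded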

The video decoding unit may include an image quality adjustment unit reducing the luminance of obtained video frames, which does not reduce the luminance of video frames from the video sequence with the highest priority ranking, while reducing the luminance of video frames from video sequences with lower priority rankings.

This construction enables luminance to be decreased at lower priority rankings, so that flickering is not noticeable for lower-ranked video display surfaces likely to have a low display rate.
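A sketch of such an image quality adjustment unit, assuming frames carry an 8-bit luminance (Y) plane represented as a list of rows; the attenuation factors are illustrative only:

    def adjust_luminance(y_plane, rank):
        """Scale the Y (luminance) plane down for lower-priority screens.
        Rank 0 is left untouched; each lower rank is attenuated further."""
        if rank == 0:
            return y_plane
        factor = max(0.2, 1.0 - 0.2 * rank)   # assumed attenuation schedule
        return [[int(y * factor) for y in row] for row in y_plane]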

The invention may also be a moving image combining apparatus combining three-dimensional CG images and at least one video sequence composed of a plurality of video frames. The moving image combining apparatus includes the following: an information storage unit for storing object information showing an outline and location for each object, and an outline and location for at least one video display area for each object; a video obtaining unit for obtaining from an external source at least one video sequence composed of a plurality of video frames generated at a fixed video display rate; a CG image storage unit; a video frame storage unit; an image storage unit; a receiving unit for receiving position information showing a position of a moving viewpoint; a graphics generating unit for generating CG images one at a time at a graphics display rate and, on completing the generation of a CG image, writing the CG image into the CG image storage unit, the CG image obtained by projecting each object whose outline and location is shown by the object information onto a projection surface, as seen from a current position of the moving viewpoint shown by the position information, and for calculating screen information showing an outline and location for at least one video screen obtained by projecting each video display area shown by an outline and location in the object information onto the projection surface; a video frame generating unit for fetching at least one video frame from the at least one video sequence at the video display rate and overwriting the fetched at least one video frame into the video frame storage unit; and a selecting unit for selecting elements forming still images from the at least one video frame written in the video frame storage unit and a CG image written in the CG image storage unit, the CG image being written in the CG image storage unit immediately prior to the time that the at least one video frame was fetched, and for writing the selected elements in the image storage unit.

This construction enables generating of CG images and decoding of video frames to be performed in parallel as separate processes. As a result, computer graphics and video images can be combined at their respective display rates, and images can be combined via a selection signal, so that the construction of the apparatus can be simplified.

The CG image storage unit may include a first graphics storage unit and a second graphics storage unit, and the video frame storage unit may include a first video storage unit and a second video storage unit. The graphics generating unit writes obtained CG images alternately in the first and second graphics storage units. The video frame generating unit writes obtained video frames alternately in the first and second video storage units. The selecting unit reads a CG image from the second graphics storage unit while the graphics generating unit is writing a CG image into the first graphics storage unit, and reads a CG image from the first graphics storage unit while the graphics generating unit is writing a CG image into the second graphics storage unit. The selecting unit also reads a video frame from the second video storage unit while the video frame generating unit is writing a video frame into the first video storage unit, and reads a video frame from the first video storage unit while the video frame generating unit is writing a video frame into the second video storage unit. Then the selecting unit selects elements forming still images from the read CG images and video frames.

This construction enables generating of CG images, decoding of video frames, and combining of generated CG images with video frames to be performed in parallel as separate processes. As a result, computer graphics and video images can be combined at their respective display rates, and generating of CG images, decoding of video frames and combining of CG images with video frames can be performed more quickly.
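The double-buffering idea can be sketched as follows: the writer always fills one buffer of a pair while the selecting unit reads the other, so a reader never observes a half-written image. The class, its methods, and the per-pixel selection via a screen mask are illustrative assumptions:

    class DoubleBuffer:
        """Two storage units; the writer fills one while readers use the other."""
        def __init__(self):
            self.buffers = [None, None]
            self.write_index = 0          # buffer currently being written

        def write(self, image):
            self.buffers[self.write_index] = image
            self.write_index = 1 - self.write_index   # swap only after completion

        def read(self):
            # Read the buffer NOT currently being written.
            return self.buffers[1 - self.write_index]

    cg_store = DoubleBuffer()      # first/second graphics storage units
    video_store = DoubleBuffer()   # first/second video storage units

    def select(cg_store, video_store, screen_mask):
        """Selecting unit: per element, take the video frame inside the video
        screen and the CG image elsewhere (flat pixel lists and screen_mask
        are simplifying assumptions)."""
        cg, frame = cg_store.read(), video_store.read()
        return [fv if m else cv for cv, fv, m in zip(cg, frame, screen_mask)]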

The graphics generating unit may further perform rendering on each generated CG image, and write the rendered CG images into the image storage unit.

This construction enables graphics to be rendered, so that computer graphics can be generated more realistically.

The video frame generating unit may generate transformed video frames by transforming the fetched video frames to fit an outline shown in the screen information, and overwrite the transformed video frames into the image storage unit.

This construction enables video frames to be transformed to fit the outline of the video screen of each object, so that a video sequence can be combined more realistically.

The following construction may also be used. Each object has a plurality of video display areas. The video obtaining unit obtains a plurality of video sequences from an external source. The moving image combining apparatus combines, on each of a plurality of video screens on a projection surface, one of the video sequences with a CG image, each video screen corresponding to one of the plurality of video display areas. The object information includes information showing outlines and locations for a plurality of video display areas. The graphics generating unit calculates screen information for each piece of information showing the outline and location for one of the plurality of video display areas. The video frame generating unit fetches video frames from each of the plurality of video sequences, and overwrites fetched video frames from the different video sequences at the different locations shown by the plurality of pieces of screen information, so that the fetched video frames fit the outlines shown in the screen information.

This construction enables video images to be combined on a plurality of video screens of objects, when an object has a plurality of video display areas.

The video frame generating unit may include the following: a priority ranking determining unit for determining a priority ranking for each video screen based on the plurality of pieces of calculated screen information; a video decoding unit for obtaining video frames from each of the plurality of video sequences, based on the determined priority ranking; a masking location calculating unit for calculating locations to be masked on each video screen, based on the plurality of pieces of calculated screen information and the priority ranking determined for each video screen; and a masking unit for masking the transformed video frames at the calculated locations. Here, the video frame generating unit overwrites the transformed video frames which have been masked into the image storage unit.

This construction enables priority rankings to be determined according to video screens of objects, video frames to be obtained from video sequences based on the priority rankings, and masking to be performed on each video screen, so that video sequences can be combined more realistically.

The priority ranking determining unit may determine priority rankings using the plurality of pieces of calculated screen information, with video screens nearer to the viewpoint having a higher priority ranking.

This construction enables video screens nearer the viewpoint to be given a higher priority ranking, so that video sequences can be combined more realistically.

The priority ranking determining unit may determine priority rankings using the plurality of pieces of calculated screen information, with video screens calculated as having a larger surface area having a higher priority ranking.

This construction enables video screens with larger areas to receive a higher priority ranking, so that picture quality can be increased.

The video decoding unit may obtain all of the video frames from a video sequence with the highest priority ranking, and omit more video frames from video sequences with lower priority rankings.

This construction enables a larger number of frames to be skipped at lower priority rankings, so that picture quality for decoded video frames can be adjusted according to the priority ranking.

The video decoding unit may include an image quality adjustment unit reducing the luminance of obtained video frames, which does not reduce the luminance of video frames from the video sequence with the highest priority ranking, while reducing the luminance of video frames from video sequences with lower priority rankings.

This construction enables luminance to be decreased at lower priority rankings, so that flicker is not noticeable for lower-ranked video display surfaces likely to have a low display rate.

A moving image combining method for combining CG images and at least one video sequence composed of a plurality of video frames may also be used. The moving image combining method is used by a moving image combining apparatus having an information storage unit and an image storage unit, the information storage unit storing object information showing an outline and location for at least one object in three-dimensional space. The moving image combining method includes the following: a video obtaining step obtaining from an external source at least one video sequence composed of a plurality of video frames generated at a fixed video display rate; a receiving step receiving position information showing a position of a moving viewpoint; a graphics generating step generating CG images one at a time at a graphics display rate and, on completing the generation of a CG image, writing the CG image into the image storage unit, the CG image obtained by projecting each object whose outline and location is shown by the object information onto a projection surface, as seen from a current position of the moving viewpoint shown by the position information; and a video frame generating step fetching at least one video frame from the at least one video sequence at the video display rate and writing the fetched at least one video frame over a CG image, the CG image being stored in the image storage unit immediately prior to the time that the at least one video frame was fetched.

When using this method, the same effects are apparent as for the moving image combining apparatus.

A moving image combining method for combining, on a video display area, CG images and at least one video sequence composed of a plurality of video frames may also be used. The moving image combining method is used by a moving image combining apparatus having an information storage unit, a CG image storage unit, a video frame storage unit, and an image storage unit, the information storage unit storing object information showing an outline and location for at least one object, and an outline and location for a video screen for each object, in three-dimensional space. The moving image combining method includes the following: a video obtaining step obtaining from an external source at least one video sequence composed of a plurality of video frames generated at a fixed video display rate; a receiving step receiving position information showing a position of a moving viewpoint; a graphics generating step generating CG images one at a time at a graphics display rate and, on completing the generation of a CG image, writing the CG image into the CG image storage unit, the CG image obtained by projecting each object whose outline and location is shown by the object information onto a projection surface, as seen from a current position of the moving viewpoint shown by the position information, and calculating screen information showing an outline and location for at least one video screen, the video screen obtained by projecting the at least one video display area shown by an outline and location in the object information onto the projection surface; a video frame generating step fetching at least one video frame from the at least one video sequence at the video display rate and overwriting the fetched at least one video frame in the video frame storage unit; and a selecting step selecting elements forming still images from the at least one video frame written in the video frame storage unit and a CG image written in the CG image storage unit, the CG image being written in the CG image storage unit immediately prior to the time that the at least one video frame was fetched, and writing the selected elements in the image storage unit.

When using this method, the same effects are apparent as for the moving image combining apparatus.

The invention may also be realized using a recording medium recording a moving image combining program combining CG images and at least one video sequence composed of a plurality of video frames. The moving image combining program is used by a computer having an information storage unit and an image storage unit, the information storage unit storing object information showing an outline and location for at least one object in three-dimensional space. The moving image combining program includes the following: a video obtaining step obtaining from an external source at least one video sequence composed of a plurality of video frames generated at a fixed video display rate; a receiving step receiving position information showing a position of a moving viewpoint; a graphics generating step generating CG images one at a time at a graphics display rate and, on completing the generation of a CG image, writing the CG image into the image storage unit, the CG image obtained by projecting each object whose outline and location is shown by the object information onto a projection surface, as seen from a current position of the moving viewpoint shown by the position information; and a video frame generating step fetching at least one video frame from the at least one video sequence at the video display rate and writing the fetched at least one video frame over a CG image, the CG image being stored in the image storage unit immediately prior to the time that the at least one video frame was fetched.

When this program is executed by a computer, the same effects are apparent as for the moving image combining apparatus.

The invention may also use a recording medium recording a moving image combining program combining, on a video display area, CG images and at least one video sequence composed of a plurality of video frames. The moving image combining program is used by a computer having an information storage unit, a CG image storage unit, a video frame storage unit, and an image storage unit, the information storage unit storing object information showing an outline and location for at least one object, and an outline and location for a video display area for each object, in three-dimensional space. The moving image combining program includes the following: a video obtaining step obtaining from an external source at least one video sequence composed of a plurality of video frames generated at a fixed video display rate; a receiving step receiving position information showing a position of a moving viewpoint; a graphics generating step generating CG images one at a time at a graphics display rate and, on completing the generation of a CG image, writing the CG image into the CG image storage unit, the CG image obtained by projecting each object whose outline and location is shown by the object information onto a projection surface, as seen from a current position of the moving viewpoint shown by the position information, and calculating screen information showing an outline and location for at least one video screen, the video screen obtained by projecting the at least one video display area shown by an outline and location in the object information onto the projection surface; a video frame generating step fetching at least one video frame from the at least one video sequence at the video display rate and overwriting the fetched at least one video frame in the video frame storage unit; and a selecting step selecting elements forming still images from the at least one video frame written in the video frame storage unit and a CG image written in the CG image storage unit, the CG image being written in the CG image storage unit immediately prior to the time that the at least one video frame was fetched, and writing the selected elements in the image storage unit.

When this program is executed by a computer, the same effects are apparent as for the moving image combining apparatus.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows an external view of a moving image combining apparatus 10 in a first embodiment of the present invention;

FIG. 2 is a block diagram showing the construction of the moving image combining apparatus 10;

FIG. 3 shows an example of an object table recorded in a data storage unit 102;

FIG. 4 shows the construction of data in an MPEG stream recorded in the data storage unit 102;

FIG. 5 shows an example screen displayed by a display unit 109;

FIG. 6 shows data in each part of the processing performed by the moving image combining apparatus 10;

FIG. 7 is a flowchart showing the operation of the moving image combining apparatus 10;

FIG. 8 is a timechart showing the timing of operations performed by the moving image combining apparatus 10;

FIG. 9 shows the combining of a CG image and a video frame performed in the related art;

FIG. 10 shows the combining of a CG image and a video frame performed by the moving image combining apparatus 10;

FIG. 11 shows an external view of a digital broadcast receiving apparatus 20 in an alternative to the first embodiment of the present invention;

FIG. 12 is a block diagram showing a structure for the digital broadcast receiving apparatus 20;

FIG. 13 is an example screen displayed by the display unit 109;

FIG. 14 shows data in each part of the processing performed by the digital broadcast receiving apparatus 20;

FIG. 15 is a flowchart showing the operation of the digital broadcast receiving apparatus 20;

FIG. 16 is a flowchart showing the operation of priority ranking calculations performed by the digital broadcast receiving apparatus 20;

FIG. 17 is a timechart showing the timing of operations performed by the digital broadcast receiving apparatus 20;

FIG. 18 is a block diagram showing a structure for a moving image combining apparatus 30 in a further alternative to the first embodiment of the present invention;

FIG. 19 is an example of control data stored in a control data storage unit in the moving image combining apparatus 30;

FIG. 20 shows data in each part of the processing performed by the moving image combining apparatus 30;

FIG. 21 shows the relation between CG images, video frames and control data in the moving image combining apparatus 30;

FIG. 22 is a flowchart showing the operation of the moving image combining apparatus 30;

FIG. 23 is a flowchart showing the operation of combining performed by the moving image combining apparatus 30;

FIG. 24 is a timechart showing the timing of operations performed by the moving image combining apparatus 30;

FIG. 25 is a block diagram of a digital broadcast receiving apparatus 40 in a further alternative to the first embodiment of the present invention;

FIG. 26 shows data in each part of the processing performed by the digital broadcast receiving apparatus 40;

FIG. 27 is a timechart showing the timing of operations performed by the digital broadcast receiving apparatus 40;

FIG. 28 is a block diagram showing a structure for a digital broadcast receiving apparatus 50 in a further alternative to the first embodiment of the present invention;

FIG. 29 is a timechart showing the timing of other operations performed by the moving image combining apparatus 10; and

FIG. 30 is a timechart showing the timing of other operations performed by the moving image combining apparatus 10.

BEST MODE FOR CARRYING OUT THE INVENTION

The following is a detailed description of the embodiments of the present invention with reference to the drawings.

1 First Embodiment

The following is a description of a moving image combining apparatus 10 in a first embodiment of the present invention.

1.1 Structure of Moving Image Combining Apparatus 10

As shown in FIG. 1, the moving image combining apparatus 10 is constructed from a main unit 11, a CD-ROM drive 12 in which a CD-ROM is loaded, a processor 13 for executing programs, a semiconductor memory 14 storing programs and data, a monitor 15, a keyboard 16, speakers 17 and a mouse 18. The moving image combining apparatus 10 reads object information for 3D objects and video image information recorded on the CD-ROM, generates a graphic image, pastes a video image onto the generated graphic image and displays the combined image on the monitor 15.

FIG. 2 is a functional block diagram of the moving image combining apparatus 10. In the drawing, the moving image combining apparatus 10 is constructed from an input unit 101, a data storage unit 102, a coordinate light source calculating unit 103, a rendering unit 104, a perspective transform unit 105, an image decoder 106, an image transform unit 107, a frame buffer 108 and a display unit 109.

(1) Input Unit 101

The input unit 101 includes the keyboard 16, the mouse 18, and the like.

The input unit 101 receives input from buttons 321 to 327 in a navigation instruction menu 303. The buttons 321 to 327 receive forward, back, left, right, up, down and operation end instructions respectively. When the input unit 101 receives an input from one of the buttons 321 to 327, it outputs information showing forward, back, left, right, up, down or operation end to the coordinate light source calculating unit 103.

The input unit 101 receives input from the buttons 321 to 327 at a rate of ten times per second.

(2) Data Storage Unit 102

The data storage unit 102 is constructed from a CD-ROM and the CD-ROM drive 12, in which the CD-ROM is loaded. Data is recorded on the CD-ROM, and the CD-ROM drive 12 reads this data as requested.

The data storage unit 102 stores an object table 201 and an MPEG stream 221, shown in FIGS. 3 and 4.

The object table 201 contains information relating to objects located in a 3D coordinate space A, and consists of data groups containing an object name 211, outline coordinates 212, location coordinates 213, and video display surface coordinates 214. Each group corresponds to one object.

The object name 211 identifies an object.

The outline coordinates 212 are a plurality of sets of 3D coordinate values in a 3D coordinate space B. Each of these sets shows one of the points forming an object. One of the points is located at the origin of the 3D coordinate space B.

The location coordinates 213 are one set of 3D coordinate values in the 3D space A. This coordinate set shows the location of the aforementioned point in the 3D space A. The video display surface coordinates 214 are a plurality of sets of 3D coordinate values in the 3D space B. These sets form part of the outline coordinates 212, and are selected so as to represent a surface with a limited area. The surface represented by the video display surface coordinates 214 is displayed with a video sequence pasted onto it. This surface is referred to as the video display surface (the video display area).

The MPEG stream 221 is a code sequence formed by compressing and encoding moving video according to the MPEG (Moving Picture Experts Group) standard. The MPEG stream 221 is constructed from a plurality of SH (sequence header) and GOP (group of pictures) pairs, as shown in FIG. 4. A GOP includes a plurality of pictures, each of which corresponds to a one-frame still image. A picture includes a plurality of slices, and each slice includes a plurality of macroblocks (MBs). Each macroblock is in turn made up of four luminance blocks Y and two chrominance blocks Cb and Cr. A block is constructed from 8×8 elements, making 64 elements in total. Since this technology is well-known in the art, a more detailed explanation will be omitted.

Consecutive pictures are decoded in turn, giving an appearance of motion.
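The hierarchy just described can be summarized in a few type definitions. This is only a structural sketch of the MPEG layering named in the text, not a decoder:

    from dataclasses import dataclass
    from typing import List, Tuple

    Block = List[List[int]]          # 8x8 elements, 64 in total

    @dataclass
    class Macroblock:
        y: List[Block]               # four luminance blocks Y
        cb: Block                    # one chrominance block Cb
        cr: Block                    # one chrominance block Cr

    @dataclass
    class Picture:                   # corresponds to a one-frame still image
        slices: List[List[Macroblock]]

    @dataclass
    class GOP:                       # group of pictures
        pictures: List[Picture]

    @dataclass
    class MPEGStream:                # plurality of (sequence header, GOP) pairs
        pairs: List[Tuple[bytes, GOP]]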

(3) Coordinate Light Source Calculating Unit 103

The coordinate light source calculating unit 103 is constructed from the processor 13, programs stored in the semiconductor memory 14, and the like.

The coordinate light source calculating unit 103 stores viewpoint coordinates E (Ex, Ey, Ez) located in the 3D coordinate space A, and also receives information showing forward, back, left, right, up, down and operation end from the input unit 101.

Upon receiving information showing a forward, back, left, right, up or down movement, the coordinate light source calculating unit 103 performs the following calculations for the viewpoint coordinates E according to the received information.

Ey=Ey+1

Ey=Ey−1

Ex=Ex+1

Ex=Ex−1

Ez=Ez+1

Ez=Ez−1
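Assuming these six formulas correspond, in order, to the forward, back, left, right, up and down instructions (the pairing is implied by the listing order rather than stated explicitly), the update can be sketched as:

    # Assumed instruction-to-axis mapping, following the order in which the
    # instructions and the formulas are listed; the actual axis conventions
    # are not fixed by the text.
    VIEWPOINT_DELTAS = {
        "forward": (0, +1, 0),   # Ey = Ey + 1
        "back":    (0, -1, 0),   # Ey = Ey - 1
        "left":    (+1, 0, 0),   # Ex = Ex + 1
        "right":   (-1, 0, 0),   # Ex = Ex - 1
        "up":      (0, 0, +1),   # Ez = Ez + 1
        "down":    (0, 0, -1),   # Ez = Ez - 1
    }

    def move_viewpoint(e, instruction):
        """Apply one navigation instruction to viewpoint E = (Ex, Ey, Ez)."""
        dx, dy, dz = VIEWPOINT_DELTAS[instruction]
        return (e[0] + dx, e[1] + dy, e[2] + dz)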

Furthermore, the coordinate light source calculating unit 103 reads the outline coordinates 212, the location coordinates 213 and the video display surface coordinates 214 for each object from the object table 201 stored in the data storage unit 102. The coordinate light source calculating unit 103 adds each value shown by the location coordinates 213 to each value shown by the outline coordinates 212, calculating 3D coordinates forming the object in the 3D coordinate space A.

Here, the coordinate light source calculating unit 103 calculates two-dimensional (2D) coordinates and depth values in relation to a plane H, located virtually in the 3D coordinate space A between the objects and the viewpoint coordinates E (Ex, Ey, Ez). The 2D coordinates represent each point of an object when it is projected onto the plane H as seen from the viewpoint coordinates E, and the depth values represent the distance by which these points are separated from the plane H in the depth direction. Next, the coordinate light source calculating unit 103 performs clipping by using the 2D coordinates and depth values, thereby extracting the parts displayed in the window of the monitor 15. The coordinate light source calculating unit 103 then outputs, to the rendering unit 104, 2D coordinates on the plane H and depth values showing the distance from the plane H in the depth direction for points belonging to each object that has been clipped. Clipping and the method used to calculate the 2D coordinates and the depth values are well-known in the art, and so explanation of these processes is omitted here.
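A minimal sketch of this projection step under a simple pinhole model, assuming the plane H is perpendicular to the viewing axis at distance d in front of the viewpoint; the patent does not fix these conventions, so the model is purely illustrative:

    def project_point(p, e, d=1.0):
        """Project 3D point p onto a plane H at distance d in front of the
        viewpoint e, looking along +z. Returns 2D coordinates on H and a
        depth value (separation from H along the viewing axis)."""
        # Coordinates relative to the viewpoint.
        x, y, z = p[0] - e[0], p[1] - e[1], p[2] - e[2]
        if z <= 0:
            return None                 # behind the viewpoint: clipped away
        u = d * x / z                   # similar-triangles projection
        v = d * y / z
        depth = z - d                   # distance from the plane H
        return (u, v), depth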

Similarly, the coordinate light source calculating unit 103 adds coordinate values shown by the location coordinates 213 to coordinate values shown by the video display surface coordinates 214, thereby calculating 3D coordinate values for points in the 3D coordinate space A forming a video display surface, and outputs the calculated 3D coordinates to the perspective transform unit 105.

(4) Rendering Unit 104

The rendering unit 104 is constructed from the processor 13, programs stored in the semiconductor memory 14, and the like.

The rendering unit 104 receives 2D coordinates and depth values for each object from the coordinate light source calculating unit 103, and performs rendering using the received 2D coordinates and depth values. This includes hidden line/surface deletion for deleting lines and surfaces that cannot be seen because they are hidden behind another object when the object is viewed from the viewpoint coordinates, displaying surface shading to make objects appear more realistic, displaying surface color, and texture mapping. The rendering unit 104 then forms a CG image from bitmap data and outputs it to the frame buffer 108. Here, the CG image is formed from a 640-pixel×480-pixel luminance signal image Y totaling 307,200 pixels, a 320-pixel×240-pixel chrominance signal image Cb totaling 76,800 pixels and a 320-pixel×240-pixel chrominance signal image Cr totaling 76,800 pixels. Each pixel has 8 bits.

Rendering processes such as hidden line/surface deletion, shading display, color display and texture mapping are well-known in the art, so further explanation will be omitted here.
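These dimensions correspond to a 4:2:0 YCbCr layout (full-resolution luminance, half-resolution chrominance). A sketch of allocating such a frame follows, with numpy assumed available:

    import numpy as np

    def allocate_cg_image():
        """Allocate the frame layout described above: a full-resolution Y
        plane plus half-resolution Cb and Cr planes, 8 bits per pixel
        (4:2:0 chroma subsampling)."""
        y  = np.zeros((480, 640), dtype=np.uint8)   # 307,200 pixels
        cb = np.zeros((240, 320), dtype=np.uint8)   #  76,800 pixels
        cr = np.zeros((240, 320), dtype=np.uint8)   #  76,800 pixels
        return y, cb, cr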

(5) Perspective Transform Unit 105

The perspective transform unit 105 is constructed from the processor 13, programs stored in the semiconductor memory 14, and the like.

The perspective transform unit 105 receives 3D coordinates for points forming a video display surface in the 3D coordinate space A from the coordinate light source calculating unit 103. The perspective transform unit 105 then calculates 2D coordinates on the plane H for points forming the video display surface, in the same way as the coordinate light source calculating unit 103, and outputs the calculated 2D coordinates to the image transform unit 107.

(6) Image Decoder 106

The image decoder 106 is constructed from the processor 13, programs stored in the semiconductor memory 14, and the like.

The image decoder 106 reads the MPEG stream 221 stored in the data storage unit 102, repeatedly generates video frames by decoding data from the read MPEG stream 221, and outputs the generated video frames to the image transform unit 107. The method used to generate video frames from an MPEG stream is well-known in the art, and so explanation will be omitted here.

The image decoder 106 decodes video frames at a rate of 30 frames per second.

(7) Image Transform Unit 107

The image transform unit 107 is constructed from the processor 13, programs stored in the semiconductor memory 14, and the like.

The image transform unit 107 receives a video frame from the image decoder 106 and 2D coordinates for points forming the video display surface from the perspective transform unit 105. Next, the image transform unit 107 transforms the received video frame to the outline represented by the received 2D coordinates using an affine transform, generating a transformed video frame. The image transform unit 107 outputs the transformed video frame to the frame buffer 108 by writing it over the area represented by the received 2D coordinates. Here, the transformed video frame is constructed from a plurality of pixels, each of which has eight bits.
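An affine transform is fully determined by three point correspondences, so one way to realize this step is to map three corners of the rectangular video frame onto three of the received 2D outline points and inverse-map every destination pixel. This is a sketch under that assumption (a true perspective fit would require a homography rather than an affine map); numpy is assumed, and the frame is single-channel for brevity:

    import numpy as np

    def affine_from_corners(src_pts, dst_pts):
        """Solve the 2x3 affine matrix M with dst = M @ [x, y, 1] from
        three point correspondences (six equations, six unknowns)."""
        a, b = [], []
        for (x, y), (u, v) in zip(src_pts, dst_pts):
            a.append([x, y, 1, 0, 0, 0]); b.append(u)
            a.append([0, 0, 0, x, y, 1]); b.append(v)
        return np.linalg.solve(np.array(a, float), np.array(b, float)).reshape(2, 3)

    def transform_frame(frame, outline):
        """Warp a single-channel video frame so that its top-left, top-right
        and bottom-left corners land on the first three outline points."""
        h, w = frame.shape
        m = affine_from_corners([(0, 0), (w - 1, 0), (0, h - 1)], outline[:3])
        inv = np.linalg.inv(np.vstack([m, [0, 0, 1]]))[:2]   # destination -> source
        max_x = int(max(p[0] for p in outline)) + 1
        max_y = int(max(p[1] for p in outline)) + 1
        out = np.zeros((max_y, max_x), dtype=frame.dtype)
        for v in range(max_y):
            for u in range(max_x):
                sx, sy = inv @ np.array([u, v, 1.0])
                if 0 <= sx < w and 0 <= sy < h:              # nearest-neighbour sample
                    out[v, u] = frame[int(sy), int(sx)]
        return out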

(8) Frame Buffer 108

The frame buffer 108 is constructed from the semiconductor memory 14 or similar, and stores still images.

(9) Display Unit 109

The display unit 109 is constructed from the monitor 15 or similar.

The display unit 109 displays a screen 301, as shown in FIG. 5. The screen 301 includes a display window 302 and a navigation instruction menu 303.

Buttons 321 to 327 are displayed in the navigation instruction menu 303. These buttons 321 to 327 receive forward, back, left, right, up, down and operation end instructions respectively.

The display unit 109 displays a still image, stored in the frame buffer 108, in the display window 302.

1.2 Operation of the Moving Image Combining Apparatus 10

(1) Operation of the Moving Image Combining Apparatus 10

The operation of the moving image combining apparatus 10 is explained with reference to FIGS. 6 and 7. FIG. 6 shows data in each process performed by the moving image combining apparatus 10, and FIG. 7 is a flowchart showing the operation of the moving image combining apparatus 10.

The coordinate light source calculating unit 103 reads outline coordinates 212, location coordinates 213 and video display surface coordinates 214 (object information 401) for each object from the object table 201 in the data storage unit 102 (step S101), and receives information showing a forward, back, left, right, up, down or operation end instruction from the input unit 101 (step S102). If information showing an operation end instruction is received, the coordinate light source calculating unit 103 ends the processing (step S103). If information showing another type of instruction is received (step S103), the coordinate light source calculating unit 103 calculates viewpoint coordinates E according to the received information, calculates 3D coordinates for points forming each object in the 3D coordinate space A, 2D coordinates for points on the plane H, and depth values showing the distance of each point from the plane H in the depth direction (the latter two forming information 402), and performs clipping (step S104). The rendering unit 104 performs rendering such as deletion of hidden lines/surfaces, display of surface shading, display of surface color, and texture mapping by using the 2D coordinates and depth values, and forms a CG image (image 403) as a bitmap image (step S105). The rendering unit 104 then outputs the CG image to the frame buffer 108 (step S106). Next, the routine returns to step S102 and the above processing is repeated.

Following step S104, the coordinate light source calculating unit 103 also calculates 3D coordinates for points forming the video display surface in the 3D coordinate space A, and the perspective transform unit 105 calculates 2D coordinates (information 405) on the plane H for points forming the video display surface (step S111).

Meanwhile, the image decoder 106 reads the MPEG stream 221 stored in the data storage unit 102 (step S121), and generates one video frame by decoding data from the read MPEG stream 221 (step S122). The image transform unit 107 receives the video frame (image 406) from the image decoder 106 and receives 2D coordinates for points forming the video display surface, calculated in step S111, from the perspective transform unit 105. The image transform unit 107 then generates a transformed video frame (image 407) by using an affine transform to change the received video frame to the outline represented by the received 2D coordinates (step S123). The image transform unit 107 outputs the transformed video frame to the frame buffer 108 by writing it over the area shown by the received 2D coordinates (step S124). This enables the transformed video frame to be pasted onto the CG image (image 404). Next, the routine returns to step S121 and repeats the above processing.

(2) Timing of Processing Performed by Each Component of the Moving Image Combining Apparatus 10

FIG. 8 is a timechart showing the timing of processing performed by each component of the moving image combining apparatus 10. The horizontal axis shows time and the vertical axis shows the processing performed by the various components.

When a CG image and video frame are newly generated, and the video frame is pasted onto the CG image, image decoding C101, coordinate light source calculation C102 and coordinate light source calculation C105 are started simultaneously. Here, coordinate light source calculation C102 is performed by the coordinate light source calculating unit 103 to calculate 3D coordinates for points in the 3D coordinate space A showing the video display surface, and coordinate light source calculation C105 is performed by the coordinate light source calculating unit 103 to calculate 2D coordinates on the plane H for points forming an object, and depth values showing the distance of each of these points from the plane H in the depth direction. Once coordinate light source calculation C102 is completed, perspective transform C103 is performed, and once this is completed, image transform C104 is performed. Meanwhile, once coordinate light source calculation C105 is completed, rendering C106 is performed. When rendering C106 and image transform C104 have been completed, display C107 takes place.

When a new video frame is generated and pasted onto a previously-generated CG image, image decoding C111 and coordinate light source calculation C112 are started simultaneously. Here, coordinate light source calculation C112 is performed by the coordinate light source calculating unit 103 to calculate 3D coordinates in the 3D coordinate space A for points forming a video display surface. Once coordinate light source calculation C112 is completed, perspective transform C113 is performed, and once this is completed, image transform C114 is performed. When image transform C114 has been completed, display C117 takes place.

FIG. 8 shows a situation in which coordinate light source calculation C105 and rendering C106 are completed in a short time, in other words within the decode cycle period for a video frame (the period from the start of decoding for a video frame until the start of decoding for the next video frame, amounting to one thirtieth of a second in the present embodiment).

What Happens When Processing for Generating a CG Image is Lengthy

The following explains a situation in which a large number of objects are stored in the object table 201, and processing for generating the CG image (coordinate light source calculation and rendering) is not completed within the decode cycle period for a video frame. Here, the input unit 101 may receive input at a high rate of, for example, one hundred times per second.

FIG. 29 is a timechart showing the timing of processing performed by various components of the moving image combining apparatus 10. As in FIG. 8, the horizontal axis shows time and the vertical axis shows processing performed by the various components of the moving image combining apparatus 10.

In the drawing, image decoding C601, coordinate light source calculation C608 and coordinate light source calculation C623 are started simultaneously. Here, coordinate light source calculation C608 is solely for calculation of points for the area on which the video frame is pasted, and coordinate light source calculation C623 is calculation for all of the objects. Thus, when there are a large number of objects, the processing time required for C623 is longer than that for C608. In addition, coordinate light source calculations C608, C609 and C610 are only performed when coordinate light source calculations C623, C624 and C625 respectively are performed. Accordingly, image transforms C618 and C619 are performed using the result of perspective transform C613, and image transform C621 is performed using the result of perspective transform C614.

Next, when image transform C616 and rendering C626 have been completed, display C629 takes place. Suppose that a transformed video frame A1 is obtained as a result of image transform C616 and a CG image B1 is obtained as a result of rendering C626. In this case, a still image which is a composite of the transformed video frame A1 and the CG image B1 is displayed in display C629.

Next, image decoding C602 is started a fixed interval (for example, one thirtieth of a second) after the start of image decoding C601. Coordinate light source calculation C609 is performed simultaneously with the start of image decoding C602, and then perspective transform C613 is performed, followed by image transform C617. Suppose that a transformed video frame A2 is then obtained as a result of image transform C617. The subsequent display C630 takes place a fixed interval (for example, one thirtieth of a second) after the start of display C629. Display C630 is started while the next coordinate light source calculation C624 is being processed, so the CG image B1 of display C629 is used as the displayed CG image. In other words, CG image B1 and transformed video frame A2 are combined and displayed in display C630.

Next, image decoding C603 is started a fixed interval after the start of image decoding C602. Once image decoding C603 is completed, image transform C618 is performed based on the result of perspective transform C613, and a transformed video frame A3 is obtained. Next, since rendering C627 has still not been completed when display C631 takes place, the CG image B1 resulting from rendering C626 is used again, so that the CG image B1 and the transformed video frame A3 are combined and displayed in display C631.

In display C632, a CG image B2 obtained as a result of rendering C627 and a transformed video frame A4 resulting from image transform C619 are combined and displayed. Subsequent display takes place in a similar fashion.

As explained above, video images can be displayed at a fixed display rate, regardless of the time required to process graphics data.

Here, the coordinate light source calculations C608 to C610 are executed when coordinate light source calculations C623 to C625 start, but may instead be executed each time image decoding is implemented, as shown in FIG. 8. Alternatively, the calculations C608 to C610 may be executed at another time, such as when rendering starts.

Pipeline Processing for Coordinate Light Source Calculation and Rendering

Coordinate light source calculation and rendering may also be performed using pipeline processing. The timing of processing performed by the various components of the moving image combining apparatus 10 in this case is shown by the timechart in FIG. 30. In the drawing, as in FIG. 8, the horizontal axis shows time and the vertical axis shows processing performed by the various components of the moving image combining apparatus 10.

As shown in the drawing, image decoding C701, coordinate light source calculation C708 and coordinate light source calculation C725 are started simultaneously. Once coordinate light source calculation C725 has been completed, coordinate light source calculation C726 using viewpoint coordinates and rendering C729 using the result of coordinate light source calculation C725 are started simultaneously. Once rendering C729 is completed, display C733 takes place. Next, once coordinate light source calculation C726 is completed, coordinate light source calculation C727 using viewpoint coordinates and rendering C730 using the result of coordinate light source calculation C726 are started simultaneously. Subsequent processing is performed in the same way.

As explained above, coordinate light source calculation and rendering are performed using pipeline processing, so that the time required for generating a CG image is shortened.
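The overlap can be sketched as a two-stage pipeline: while the CG image for viewpoint N is being rendered, the coordinate light source calculation for viewpoint N+1 already runs. The thread usage and the placeholder stage functions below are illustrative assumptions:

    from concurrent.futures import ThreadPoolExecutor

    # Placeholder stages; the real units would do the work described above.
    def coordinate_light_source_calc(viewpoint):
        return f"coords@{viewpoint}"

    def render(coords):
        return f"CG({coords})"

    def pipelined_generation(viewpoints):
        """Two-stage pipeline: stage 1 (coordinate light source calculation)
        for the next viewpoint runs on a worker while stage 2 (rendering)
        for the current viewpoint runs here."""
        results = []
        with ThreadPoolExecutor(max_workers=1) as pool:
            pending = pool.submit(coordinate_light_source_calc, viewpoints[0])
            for vp in viewpoints[1:]:
                coords = pending.result()
                pending = pool.submit(coordinate_light_source_calc, vp)  # stage 1, next frame
                results.append(render(coords))                           # stage 2, this frame
            results.append(render(pending.result()))
        return results

    print(pipelined_generation([(0, 0, 0), (0, 1, 0), (0, 2, 0)]))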

1.3 Summary

As was explained, generation of a CG image and decode/transform processing for a video frame are performed in parallel using separate processes, and the generated CG image and video frame are combined in the frame buffer 108. This means that computer graphics and video images can be combined at their respective display rates.

In other words, a CG image is generated ten times per second and a video frame thirty times per second, and each can be combined in the frame buffer whenever it is generated.

FIG. 9 shows the situation when CG images and video frames are combined in the related art. In the drawing, a cuboid graphic rotates in the order of images 501 to 503 and 508 to 510. A video sequence is pasted onto the video display surface of the cuboid. The display rate for the video sequence is fixed, so that, as shown in the drawing, several video frames are omitted from the video sequence when the image changes from 503 to 508. As a result, the video sequence does not move smoothly.

FIG. 10 shows the situation when CG images and video frames are combined using the moving image combining apparatus 10. In the drawing, images 504 to 507, generated between the images 503 and 508 of FIG. 9, are displayed. As shown here, when the image changes from the image 503 to the image 508, video frames are pasted onto CG images without omitting several of the frames from the video sequence, so that the video sequence appears to move smoothly.

Here, the perspective transform performed by the perspective transform unit 105 is also performed by the rendering unit 104. Therefore, rather than providing the perspective transform unit 105, calculation results from perspective transforms performed by the rendering unit 104 may be output to the image transform unit 107. Furthermore, adjustment of the depth direction, also known as perspective adjustment, may be performed by the image transform unit 107 by obtaining rotation information from the perspective transform unit 105.

Furthermore, the coordinate light source calculating unit 103, the rendering unit 104, the perspective transform unit 105, the image decoder 106, and the image transform unit 107 in the above embodiment are constructed from the processor 13, programs stored in the semiconductor memory 14, and the like, but each of these components may alternatively be constructed from specialized hardware.

In the above embodiment, the moving video is an MPEG stream constructed according to the MPEG standard, but a different data construction may be used.

2 Second Embodiment

The following is an explanation of a digital broadcast receiving apparatus 20 in an alternative to the first embodiment of the present invention.

2.1 Construction of Digital Broadcast Receiving Apparatus 20

As shown in FIG. 11, the digital broadcast receiving apparatus 20 includes a main unit 26, a monitor 21, a remote controller 22 and an antenna 23. The main unit 26 includes a tuner 110 for receiving broadcast waves having various channels, each carrying a video broadcast constructed from an MPEG stream, a CD-ROM drive in which a CD-ROM is loaded, a processor for executing programs, and a semiconductor memory storing programs and data. The digital broadcast receiving apparatus 20 reads object information related to 3D objects stored on the CD-ROM, receives a plurality of broadcast video images, and generates a CG image constructed from a plurality of objects. Each object has a video display surface; a video frame is pasted onto the video display surface of each object in the generated CG image, and the resulting image is displayed on the monitor 21.

A block diagram of the digital broadcast receiving apparatus 20 is shown in FIG. 12. As shown in the drawing, the digital broadcast receiving apparatus 20 is constructed from an input unit 101, a data storage unit 102, a coordinate light source calculating unit 103, a rendering unit 104, a perspective transform unit 105, image decoders 106a, 106b and 106c, image transform units 107a, 107b and 107c, a frame buffer 108, a display unit 109, a tuner 110, a priority ranking control unit 111, a masking control unit 112, masking units 113a, 113b and 113c, and an antenna 23.

(1) Input Unit 101

The input unit 101 is constructed from the remote controller 22 or similar.

As shown in FIG. 11, numbered buttons, navigation instruction buttons 25, a menu button 24 and the like are included on the upper face of the remote controller 22. When one of the buttons is operated by a user, information corresponding to the operated button is output to the tuner 110 and the coordinate light source calculating unit 103.

The user presses one of the numbered buttons to indicate a channel on which a video is to be received. Pressing the menu button 24 displays a program menu like the one in FIG. 13 on the monitor 21, while pressing the navigation instruction buttons 25 moves a virtual viewpoint within the program menu shown in the drawing, indicating movement forward, back, left, right, up and down.

(2) Data Storage Unit 102

The data storage unit 102 stores an object table 201, like the data storage unit 102 in the moving image combining apparatus 10.

The object table 201 is the same as the object table 201 stored in the data storage unit 102 in the moving image combining apparatus 10.

(3) Coordinate Light Source Calculating Unit 103

The coordinate light source calculating unit 103 is constructed from the processor 13, programs stored in the semiconductor memory 14, and the like, in the same way as the coordinate light source calculating unit 103 in the moving image combining apparatus 10.

The coordinate light source calculating unit 103 stores viewpoint coordinates E (Ex, Ey, Ez) in the 3D coordinate space A and also receives information showing forward, back, left, right, up, down and operation end instructions from the input unit 101.

Upon receiving information showing a forward, back, left, right, up or down movement, the coordinate light source calculating unit 103 performs one of the following calculations on the viewpoint coordinates E, according to the received information.

Ey=Ey+1

Ey=Ey−1

Ex=Ex+1

Ex=Ex−1

Ez=Ez+1

Ez=Ez−1
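
The six updates above can be read as a single switch on the received instruction. The following C sketch assumes, purely for illustration, a mapping of up/down to Ey, left/right to Ex and forward/back to Ez; the embodiment itself only lists the six unit-step calculations.

    /* Minimal sketch of the viewpoint update performed by the
     * coordinate light source calculating unit 103. The mapping of
     * instructions to axes is an assumption for illustration. */
    #include <stdio.h>

    typedef struct { double ex, ey, ez; } Viewpoint;
    typedef enum { UP, DOWN, RIGHT, LEFT, FORWARD, BACK } Move;

    static void move_viewpoint(Viewpoint *e, Move m)
    {
        switch (m) {
        case UP:      e->ey += 1; break;  /* Ey = Ey + 1 */
        case DOWN:    e->ey -= 1; break;  /* Ey = Ey - 1 */
        case RIGHT:   e->ex += 1; break;  /* Ex = Ex + 1 */
        case LEFT:    e->ex -= 1; break;  /* Ex = Ex - 1 */
        case FORWARD: e->ez += 1; break;  /* Ez = Ez + 1 */
        case BACK:    e->ez -= 1; break;  /* Ez = Ez - 1 */
        }
    }

    int main(void)
    {
        Viewpoint e = { 0.0, 0.0, 0.0 };
        move_viewpoint(&e, FORWARD);
        printf("E = (%g, %g, %g)\n", e.ex, e.ey, e.ez);
        return 0;
    }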

Furthermore, the coordinate light source calculating unit 103 reads the outline coordinates 212, the location coordinates 213 and the video display surface coordinates 214 for each object from the object table 201 stored in the data storage unit 102. The coordinate light source calculating unit 103 adds each value shown by the location coordinates 213 to each value shown by the outline coordinates 212 and calculates 3D coordinates forming the objects in the 3D coordinate space A.

The coordinate light source calculating unit 103 calculates 2D coordinates and depth values in relation to a plane H located virtually in the 3D coordinate space between the objects and the viewpoint coordinates E (Ex, Ey, Ez). The 2D coordinates represent each point of the objects projected onto the plane H, seen from the direction of the viewpoint coordinates E, and the depth values represent the distance by which each point is separated from the plane H in the depth direction. Next, the coordinate light source calculating unit 103 performs clipping using the 2D coordinates and depth values, thereby extracting the parts displayed in the display window of the monitor 21. The coordinate light source calculating unit 103 then outputs to the rendering unit 104 2D coordinates on the plane H and depth values showing the distance from the plane H in the depth direction for points belonging to each object which has been clipped. Clipping and the method used to calculate the 2D coordinates and the depth values are well-known in the art, so explanation of these processes is omitted here.
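
A minimal C sketch of this projection follows. It assumes, for illustration only, that the plane H is the plane z = hz, perpendicular to the viewing axis, and that each point is projected along the ray from the viewpoint E; the embodiment treats the exact calculation as well-known and does not fix these details.

    /* Projecting a 3D point onto the virtual plane H and computing
     * its depth value, under the assumptions stated above. */
    #include <stdio.h>

    typedef struct { double x, y, z; } Point3;
    typedef struct { double x, y; double depth; } Projected;

    static Projected project_onto_h(Point3 e, double hz, Point3 p)
    {
        Projected q;
        /* ray from E through P meets z = hz (assumes p.z != e.z) */
        double t = (hz - e.z) / (p.z - e.z);
        q.x = e.x + t * (p.x - e.x);
        q.y = e.y + t * (p.y - e.y);
        q.depth = p.z - hz;   /* separation from plane H in depth */
        return q;
    }

    int main(void)
    {
        Point3 e = { 0, 0, -10 };   /* viewpoint E */
        Point3 p = { 2, 1, 5 };     /* a point on an object */
        Projected q = project_onto_h(e, 0.0, p);
        printf("2D = (%g, %g), depth = %g\n", q.x, q.y, q.depth);
        return 0;
    }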

Similarly, the coordinate light source calculating unit 103 adds coordinate values shown by the location coordinates 213 to coordinate values shown by the outline coordinates 212, thereby calculating 3D coordinate values for points in the 3D coordinate space A forming each object, and outputs the calculated 3D coordinates to the perspective transform unit 105 and the priority ranking control unit 111.

(4) Rendering Unit 104

The rendering unit 104 is the same as the rendering unit 104 in the moving image combining apparatus 10, and so explanation is omitted here.

(5) Perspective Transform Unit 105

The perspective transform unit 105 is constructed from the processor 13, programs stored in the semiconductor memory 14, and the like, in the same way as the perspective transform unit 105 in the first embodiment.

The perspective transform unit 105 receives 3D coordinates for points forming each object in the 3D coordinate space A from the coordinate light source calculating unit 103, and calculates 2D coordinates on the plane H for points forming the video display surface of each object, in the same way as the coordinate light source calculating unit 103. It outputs the calculated 2D coordinates forming the video display surface of each object to the corresponding image transform unit 107 a to 107 c, and the calculated 2D coordinates for all of the video display surfaces to the priority ranking control unit 111.

(6) Priority Ranking Control Unit 111

The priority ranking control unit 111 receives 3D coordinates for points forming each object in the 3D coordinate space A from the coordinate light source calculating unit 103, and 2D coordinates for points on the plane H forming the video display surface of each object from the perspective transform unit 105.

The priority ranking control unit 111 determines a representative value for each object by selecting the largest Z coordinate value from the Z coordinate values of the points forming the video display surface. Next, the priority ranking control unit 111 ranks the objects in order, starting with the object having the smallest representative value. Thus the video display surface of each object is given a ranking.

Next, the priority ranking control unit 111 detects objects with overlapping video display surfaces, and determines which of the detected objects has the video display surface nearest the front of the 3D coordinate space, by referring to the 3D coordinates for points forming the objects. Furthermore, the priority ranking control unit 111 leaves the priority ranking of the object whose video display surface is nearest the front unchanged, while lowering the priority ranking of the other objects with overlapping video display surfaces.

When objects are ranked in this way, objects nearer to the plane H in the 3D coordinate space have a higher ranking. The priority ranking for each image decoder is determined based on this ranking, and output to the image decoder 106 a to 106 c corresponding to each object.
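
The ranking by representative value can be sketched in C as follows. The three-object scene and its Z values are illustrative only, and the subsequent adjustment for overlapping video display surfaces is omitted here.

    /* Sketch of the priority ranking control unit 111: each object's
     * representative value is the largest Z coordinate among the
     * points of its video display surface, and objects are ranked
     * starting from the smallest representative value. */
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct {
        int id;
        const double *z;   /* Z values of the surface's points */
        int n;             /* number of points */
        double rep;        /* representative value */
    } Object;

    static double max_z(const double *z, int n)
    {
        double m = z[0];
        for (int i = 1; i < n; i++)
            if (z[i] > m) m = z[i];
        return m;
    }

    static int by_rep(const void *a, const void *b)
    {
        const Object *x = a, *y = b;
        return (x->rep > y->rep) - (x->rep < y->rep);
    }

    int main(void)
    {
        double za[] = { 3, 5, 4 }, zb[] = { 1, 2, 2 }, zc[] = { 7, 6, 8 };
        Object obj[] = { {1, za, 3, 0}, {2, zb, 3, 0}, {3, zc, 3, 0} };
        for (int i = 0; i < 3; i++)
            obj[i].rep = max_z(obj[i].z, obj[i].n);
        qsort(obj, 3, sizeof obj[0], by_rep);  /* smallest rep = rank 1 */
        for (int i = 0; i < 3; i++)
            printf("rank %d: object %d (rep %g)\n",
                   i + 1, obj[i].id, obj[i].rep);
        return 0;
    }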

(7) Antenna 23 and Tuner 110

The antenna 23 receives broadcast waves and outputs them to the tuner 110.

Upon receiving information corresponding to the menu button 24 from the input unit 101, the tuner 110 selects video sequences constructed from three MPEG streams broadcast on three channels from the broadcast waves received by the antenna 23, and outputs each of the three selected video sequences to one of the image decoders 106 a to 106 c.

(8) Image Decoders 106 a, 106 b, 106 c

The image decoder 106 a is constructed from the processor 13, programs stored in the semiconductor memory 14, and the like, in the same way as the image decoder 106 in the moving image combining apparatus 10.

The image decoder 106 a receives a priority ranking from the priority ranking control unit 111.

The image decoder 106 a also receives a video sequence formed from one MPEG stream from the tuner 110. The image decoder 106 a then repeatedly generates video frames by decoding data from the received MPEG stream, and outputs the generated video frames to the image transform unit 107 a, according to the received priority ranking.

If the highest priority ranking is received, the image decoder 106 a decodes all of the video frames from the MPEG stream.

If a medium priority ranking is received, the image decoder 106 a decodes every other video frame from the MPEG stream.

If a low priority ranking is received, the image decoder 106 a decodes one in every four video frames from the MPEG stream. In other words, the image decoder 106 a skips three out of every four video frames.

In this way, the lower the priority ranking, the more video frames from the MPEG stream are skipped, leaving a greater number of video frames undecoded.
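
This frame-thinning rule can be sketched in C with the MPEG stream reduced to a running frame index. The one-in-two and one-in-four patterns follow the embodiment; which frames within each group are kept is an assumption.

    /* Sketch of how an image decoder such as 106a thins out frames
     * according to its priority ranking. */
    #include <stdbool.h>
    #include <stdio.h>

    typedef enum { PRIO_HIGH, PRIO_MEDIUM, PRIO_LOW } Priority;

    static bool should_decode(Priority p, int frame_index)
    {
        switch (p) {
        case PRIO_HIGH:   return true;                 /* all frames  */
        case PRIO_MEDIUM: return frame_index % 2 == 0; /* every other */
        case PRIO_LOW:    return frame_index % 4 == 0; /* one in four */
        }
        return false;
    }

    int main(void)
    {
        for (int i = 0; i < 8; i++)
            printf("frame %d: high=%d medium=%d low=%d\n", i,
                   should_decode(PRIO_HIGH, i),
                   should_decode(PRIO_MEDIUM, i),
                   should_decode(PRIO_LOW, i));
        return 0;
    }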

The image decoders 106 b and 106 c are identical to the image decoder 106 a.

(9) Image Transform Units 107 a, 107 b and 107 c

The image transform unit 107 a is constructed from the processor 13, programs stored in the semiconductor memory 14, and the like, in the same way as the image transform unit 107 in the moving image combining apparatus 10.

The image transform unit 107 a receives a video frame from the image decoder 106 a and 2D coordinates for points forming the video display surface from the perspective transform unit 105. Next, the image transform unit 107 a changes the received video frame to the outline represented by the received 2D coordinates using an affine transform, thereby generating a transformed video frame. The transformed video frame is output to the masking unit 113 a.
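
The affine transform can be sketched in C as follows. It assumes the outline is given by three corner points, since an affine map is fully determined by three correspondences, and shows only the coordinate mapping, not the pixel resampling.

    /* Sketch of the affine transform used by the image transform unit
     * 107a: a w x h video frame is mapped so that its corners land on
     * the outline received from the perspective transform unit 105.
     * The map is q = A*p + t, chosen so (0,0)->q0, (w,0)->q1, (0,h)->q2. */
    #include <stdio.h>

    typedef struct { double x, y; } P2;
    typedef struct { double a, b, c, d, tx, ty; } Affine;

    static Affine from_corners(double w, double h, P2 q0, P2 q1, P2 q2)
    {
        Affine m;
        m.a = (q1.x - q0.x) / w;  m.b = (q2.x - q0.x) / h;
        m.c = (q1.y - q0.y) / w;  m.d = (q2.y - q0.y) / h;
        m.tx = q0.x;              m.ty = q0.y;
        return m;
    }

    static P2 apply(Affine m, P2 p)
    {
        P2 q = { m.a * p.x + m.b * p.y + m.tx,
                 m.c * p.x + m.d * p.y + m.ty };
        return q;
    }

    int main(void)
    {
        P2 q0 = {100, 50}, q1 = {300, 80}, q2 = {90, 200};
        Affine m = from_corners(640, 480, q0, q1, q2);
        P2 centre = apply(m, (P2){320, 240});  /* centre of the frame */
        printf("centre maps to (%g, %g)\n", centre.x, centre.y);
        return 0;
    }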

The image transform units 107 b and 107 c are identical to the image transform unit 107 a.

(10) Masking Control Unit 112

The masking control unit 112 is constructed from the processor 13, programs stored in the semiconductor memory 14, and the like.

The masking control unit 112 receives 3D coordinates for points forming objects in the 3D coordinate space A from the coordinate light source calculating unit 103, and receives 2D coordinates on the plane H for points forming the video display surface of each object from the perspective transform unit 105. The masking control unit 112 detects overlapping objects and calculates masked areas using the received 3D coordinates and 2D coordinates, a masked area being the area of each video display surface that cannot be seen because it is concealed behind another object. The masking control unit 112 then outputs the calculated masked area for each object to the masking unit 113 a to 113 c corresponding to the video display surface of each object.

(11) Masking Units 113 a, 113 b, 113 c

The masking unit 113 a is constructed from the processor 13, programs stored in the semiconductor memory 14, and the like.

The masking unit 113 a receives 2D coordinates for points on the plane H forming the video display surface of each object from the perspective transform unit 105.

Furthermore, the masking unit 113 a receives a transformed video frame from the image transform unit 107 a and a masked area from the masking control unit 112. The masking unit 113 a then sets all of the pixel values in the area of the transformed video frame shown by the masked area at 0. Next, the masking unit 113 a outputs the transformed video frame, in which all of the pixel values in the masked area have been set at 0, to the frame buffer 108 by writing it over the area shown by the received 2D coordinates.
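
A minimal C sketch of this masking step follows, with the masked area modelled as a per-pixel flag array; the actual representation of masked areas is not specified by the embodiment.

    /* Sketch of the masking performed by the masking unit 113a: every
     * pixel of the transformed video frame that falls inside the
     * masked area is set to 0 before the frame is written over the
     * video display surface in the frame buffer. */
    #include <stdio.h>
    #include <string.h>

    #define W 8
    #define H 4

    static void apply_mask(unsigned char frame[H][W],
                           const unsigned char masked[H][W])
    {
        for (int y = 0; y < H; y++)
            for (int x = 0; x < W; x++)
                if (masked[y][x])
                    frame[y][x] = 0;  /* concealed behind another object */
    }

    int main(void)
    {
        unsigned char frame[H][W], masked[H][W] = {0};
        memset(frame, 0xff, sizeof frame);
        masked[1][2] = masked[1][3] = 1;  /* example overlap region */
        apply_mask(frame, masked);
        printf("pixel (2,1) = %d, pixel (0,0) = %d\n",
               frame[1][2], frame[0][0]);
        return 0;
    }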

The masking units 113 b and 113 c are identical to the masking unit 113 a.

(12) Frame Buffer 108

The frame buffer 108 is identical to the frame buffer 108 in the moving image combining apparatus 10.

(13) Display Unit 109

The display unit 109 displays a screen 321, as shown in FIG. 13. Objects 332, 333 and 334 are displayed in the screen 321, and each object has a video display surface, numbered 322, 323 and 324 respectively. A video sequence is displayed on each of the video display surfaces.

2.2 Operation of Digital Broadcast Receiving Apparatus 20

(1) Operation of Digital Broadcast Receiving Apparatus 20

The operation of the digital broadcast receiving apparatus 20 is explained with reference to FIGS. 14 to 16. FIG. 14 shows data for each process performed by the digital broadcast receiving apparatus 20, FIG. 15 is a flowchart showing the operation of the digital broadcast receiving apparatus 20, and FIG. 16 is a flowchart showing the operation of the priority ranking control unit 111 in the digital broadcast receiving apparatus 20.

Steps in the flowchart shown in FIG. 15 having the same numerical references as steps in the flowchart of FIG. 7 have the same processing content. The following explanation concentrates on the differences from the flowchart of FIG. 7. 3D coordinates 412 for points forming objects in the 3D space A are calculated based on object information 411 stored in the data storage unit 102, and coordinates 414 forming the video display surfaces are then calculated based on these 3D coordinates 412. A CG image 413 is formed based on the 3D coordinates 412.

Once calculation of the coordinates 414 forming the video display surfaces is completed in step S111, the priority ranking control unit 111 determines a priority ranking for each image decoder, and outputs the determined priority rankings to the corresponding image decoders (step S201). Next, the image decoder 106 a receives an MPEG stream (step S121 a) and decodes the MPEG stream, generating a video frame 415 (step S122 a). The image decoder 106 a determines whether the video frame is to be reproduced by referring to the priority ranking (step S202 a). If the video frame is not to be reproduced, processing returns to step S201. If the video frame is to be reproduced, the image transform unit 107 a transforms it, generating a transformed video frame 416 (step S123 a). The masking unit 113 a generates a transformed video frame 417 on which masking has been implemented (step S203 a), and writes it in the frame buffer 108 (step S124 a). Processing then returns to step S201.

In the same way, steps S121 b to S122 b, S202 b, S123 b, S203 b and S124 b generate a video frame 418, generate a transformed video frame 419 and generate a transformed video frame 420 on which masking has been implemented, before writing it in the frame buffer 108. Similarly, steps S121 c to S122 c, S202 c, S123 c, S203 c and S124 c generate a video frame 421, generate a transformed video frame 422 and generate a transformed video frame 423 on which masking has been implemented, before writing it in the frame buffer 108.

In this way, a still image 425, in which the three video frames have been pasted onto the video display surfaces of three objects in the CG image, is generated in the frame buffer 108.

The following is an explanation of the priority ranking determining operation performed in step S201 by the priority ranking control unit 111.

The priority ranking control unit 111 determines a representative value for each object as the largest Z coordinate value from the Z coordinate values for points forming the video display surface, and ranks the objects in order, starting with the object having the smallest representative value (step S211). Next, the priority ranking control unit 111 detects objects with overlapping video display surfaces using the 3D coordinates for points forming each object (step S212). The priority ranking control unit 111 then determines which of the objects with overlapping video display surfaces is nearest the front of the 3D coordinate space (step S213), and leaves the priority ranking of the video display surface nearest the front unchanged, while lowering the priority ranking of the other objects with overlapping video display surfaces (step S214).

(2) Timing of Processing Performed by Each Component of the Digital Broadcast Receiving Apparatus 20

FIG. 17 is a timechart showing processing timing for various components of the digital broadcast receiving apparatus 20. The horizontal axis shows time and the vertical axis shows processing performed by various components of the digital broadcast receiving apparatus 20.

When a CG image and video frames are newly generated, and the video frames are pasted onto the CG image, coordinate light source calculation C204 and coordinate light source calculation C211 are simultaneously started. Here, coordinate light source calculation C204 is performed by the coordinate light source calculating unit 103 to calculate 3D coordinates for points in the 3D coordinate space A showing video display surfaces, and coordinate light source calculation C211 is performed by the coordinate light source calculating unit 103 to calculate 2D coordinates on the plane H for points forming objects, and depth values showing the distance of each point from the plane H in the depth direction. Once the coordinate light source calculation C204 is completed, perspective transform C205 is performed, and once this is completed, priority ranking control C206 and masking control C207 are simultaneously started. In addition, once priority ranking control C206 is completed, the image decoders 106 a to 106 c start image decoding C201, C202 and C203. When image decoding C201, C202 and C203 have been completed, image transform/masking C208, C209 and C210 are started. Meanwhile, once coordinate light source calculation C211 is completed, rendering C212 is performed. Once rendering C212 and image transform/masking C208, C209 and C210 are completed, display C213 takes place.

When new video frames are generated and pasted onto a previously-generated CG image, coordinate light source calculation C224 is started. Here, coordinate light source calculation C224 is performed by the coordinate light source calculating unit 103 to calculate 3D coordinates in the 3D coordinate space A for points forming video display surfaces. Once coordinate light source calculation C224 is completed, perspective transform C225 is performed, and once this is completed, priority ranking control C226 and masking control C227 are simultaneously started. When priority ranking control C226 is completed, the image decoders 106 a to 106 c start image decoding C221, C222 and C223, and once this is completed, image transform/masking C228, C229 and C230 are started. Once image transform/masking C228, C229 and C230 are completed, display C231 is performed.

2.3 Summary

As explained above, generation of a CG image and decode/transform processing for a plurality of video frames are performed in parallel using separate processes, and the generated CG image and plurality of video frames are combined in the frame buffer 108. This means that computer graphics and images from a plurality of video sequences can be combined at their respective display rates. Additionally, the video display surface nearest the front of the 3D coordinate space has a higher priority ranking, and the number of frames decoded by the corresponding image decoder in a fixed time is increased, so that image quality increases as the priority ranking is raised.

The priority ranking control unit 111 is described as giving the video display surface nearest the front of the 3D coordinate space the highest priority ranking, using Z coordinate values as a reference, but alternatively the surface area of the video display surface may be used as a reference, so that the areas of the video display surfaces are calculated and video display surfaces with larger areas are given a higher priority ranking.

The coordinate light source calculating unit 103, the rendering unit 104, the perspective transform unit 105, the image decoders 106 a to 106 c, the image transform units 107 a to 107 c, the priority ranking control unit 111, the masking control unit 112, and the masking units 113 a to 113 c in this embodiment are constructed from a processor, programs stored in a semiconductor memory, and the like, but they may each be constructed from specialist hardware.

Alternatively, image decoders among the image decoders 106 a to 106 c having a high priority ranking may be constructed from specialist hardware, and image decoders with a low priority ranking from the processor and programs stored in the semiconductor memory. This enables decoding of video sequences with a high priority ranking to be performed quickly.

3 Third Embodiment

The following is an explanation of a moving image combining apparatus 30, a further alternative to the first embodiment of the present invention.

3.1 Construction of Moving Image Combining Apparatus 30

The moving image combining apparatus 30, like the moving image combining apparatus 10, is constructed from a main unit 11, a CD-ROM drive 12 in which a CD-ROM is loaded, a processor 13 executing programs, a semiconductor memory 14 storing programs and data, a monitor 15, a keyboard 16, speakers 17 and a mouse 18. The moving image combining apparatus 30 reads object information for a three-dimensional object and video frames recorded on the CD-ROM, generates a CG image, pastes a video frame onto the CG image and displays the combined image on the monitor 15.

FIG. 18 is a block diagram of the moving image combining apparatus 30. In the drawing, the moving image combining apparatus 30 is constructed from an input unit 101, a data storage unit 102, a coordinate light source calculating unit 103, a rendering unit 104, a perspective transform unit 105, an image decoder 106, an image transform unit 107, a frame buffer 108, a display unit 109, a control data storage unit 114, a video frame storage unit 115, a graphics storage unit 116 and a selection unit 117.

The components of the moving image combining apparatus 30 having the same numerical references as components of the moving image combining apparatus 10 have the same construction, so the following explanation concentrates on the differences from the components of the moving image combining apparatus 10.

(1) Graphics Storage Unit 116

The graphics storage unit 116 has a graphics area A116 a and a graphics area B116 b, each storing a CG image.

(2) Rendering Unit 104

Rather than outputting generated CG images to the frame buffer 108, the rendering unit 104 outputs them alternately to the graphics areas A116 a and B116 b.

(3) Control Data Storage Unit 114

The control data storage unit 114 has a control data area A114 a and a control data area B114 b, each storing control data.

(4) Perspective Transform Unit 105

The perspective transform unit 105 further generates control data, and outputs the generated control data alternately to the control data areas A114 a and B114 b, as explained below.

One example of the control data is the control data 601 shown in FIG. 19. The control data 601 is a 640×480-bit data sequence, totaling 307,200 bits. Each bit has a value of either 1 or 0 and corresponds to a pixel in the CG image generated by the rendering unit 104.

The perspective transform unit 105 sets the value of bits in the control data 601 corresponding to the video display surface at 1, and the values of all other bits at 0.
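
A C sketch of building such control data follows. A rectangular video display surface is assumed purely for illustration; in general the marked region is the projected outline of the surface.

    /* Sketch of the control data generated by the perspective
     * transform unit 105 in this embodiment: one flag per pixel of
     * the 640 x 480 CG image, set to 1 where the pixel belongs to
     * the video display surface and 0 elsewhere. */
    #include <stdio.h>
    #include <string.h>

    #define W 640
    #define H 480

    static unsigned char control[H][W];  /* one byte per bit, for clarity */

    static void mark_surface(int x0, int y0, int x1, int y1)
    {
        for (int y = y0; y < y1; y++)
            for (int x = x0; x < x1; x++)
                control[y][x] = 1;       /* pixel shows the video frame */
    }

    int main(void)
    {
        memset(control, 0, sizeof control);  /* all CG pixels initially */
        mark_surface(100, 80, 300, 240);     /* video display surface */
        printf("bit at (150,100) = %d, at (0,0) = %d\n",
               control[100][150], control[0][0]);
        return 0;
    }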

(5) Video Frame Storage Unit 115

The video frame storage unit 115 has a video frame area A115 a and a video frame area B115 b, each storing a video frame.

(6) Image Transform Unit 107

Rather than outputting generated transformed video frames to the frame buffer 108 by writing them over the area shown by the received 2D coordinates, the image transform unit 107 outputs the transformed video frames alternately to the video frame area A115 a and the video frame area B115 b.

(7) Selection Unit 117

The selection unit 117 reads CG images alternately from the graphics area A116 a and the graphics area B116 b, video frames alternately from the video frame area A115 a and the video frame area B115 b, and control data alternately from the control data area A114 a and the control data area B114 b.

The selection unit 117 determines whether each bit of the read control data is 1 or 0. If a bit is 0, the selection unit 117 reads a pixel at a location corresponding to the bit from the pixels forming the read CG image, and writes the read pixel at a corresponding location in the frame buffer 108. If a bit is 1, the selection unit 117 reads a pixel at a location corresponding to the bit from the pixels forming the read video frame and writes the read pixel at a corresponding location in the frame buffer 108.
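
The per-pixel selection can be sketched in C as follows, using small buffers in place of full 640×480 images.

    /* Sketch of the selection unit 117 (steps S312 to S314 of
     * FIG. 23): where the control bit is 1 the pixel comes from the
     * video frame, where it is 0 it comes from the CG image. */
    #include <stdio.h>

    #define W 4
    #define H 2

    static void combine(const unsigned char cg[H][W],
                        const unsigned char video[H][W],
                        const unsigned char control[H][W],
                        unsigned char out[H][W])
    {
        for (int y = 0; y < H; y++)
            for (int x = 0; x < W; x++)
                out[y][x] = control[y][x] ? video[y][x]  /* C(x,y) == 1 */
                                          : cg[y][x];    /* C(x,y) == 0 */
    }

    int main(void)
    {
        unsigned char cg[H][W]    = {{10,10,10,10},{10,10,10,10}};
        unsigned char video[H][W] = {{99,99,99,99},{99,99,99,99}};
        unsigned char ctrl[H][W]  = {{0,1,1,0},{0,0,1,0}};
        unsigned char out[H][W];
        combine(cg, video, ctrl, out);
        printf("out[0][1] = %d (video), out[0][0] = %d (CG)\n",
               out[0][1], out[0][0]);
        return 0;
    }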

3.2 Operation of the Moving Image Combining Apparatus 30

The following is an explanation of the moving image combining apparatus 30, with reference to FIGS. 20 to 24.

(1) Data in Each Process Performed by the Moving Image Combining Apparatus 30

FIG. 20 shows data in each process performed by the moving image combining apparatus 30.

As shown in the drawing, the coordinate light source calculating unit 103 calculates 3D coordinates 452 for points forming an object in the 3D coordinate space A using object information 451. The rendering unit 104 performs rendering, generating a CG image formed of bitmap data, and outputs the generated CG image to the graphics area B116 b. The perspective transform unit 105 calculates 2D coordinates 453 on the plane H for points forming a video display surface, and generates control data before outputting it to the control data area B114 b. The image transform unit 107 transforms a video frame 454 into a transformed video frame and outputs the transformed video frame to the video frame area B115 b.

Meanwhile, the selection unit 117 reads a CG image from the graphics area A116 a, a video frame from the video frame area A115 a, and control data from the control data area A114 a. The selection unit 117 combines the CG image and the video frame using the read control data and writes the combined image in the frame buffer 108.

At other times, the rendering unit 104 outputs a generated CG image to the graphics area A116 a, the perspective transform unit 105 outputs control data to the control data area A114 a and the image transform unit 107 outputs a transformed video frame to the video frame area A115 a. Meanwhile, the selection unit 117 reads a CG image from the graphics area B116 b, a video frame from the video frame area B115 b, and control data from the control data area B114 b. The selection unit 117 combines the CG image and the video frame using the read control data, and writes the combined image in the frame buffer 108.

In this way, the outputting of data to the graphics area A116 a, the control data area A114 a and the video frame area A115 a and the reading of data from the graphics area B116 b, the control data area B114 b and the video frame area B115 b alternate with the outputting of data to the graphics area B116 b, the control data area B114 b and the video frame area B115 b and the reading of data from the graphics area A116 a, the control data area A114 a and the video frame area A115 a.
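
This alternation amounts to ping-pong (double) buffering, sketched below in C with buffer contents reduced to integers; the assumption that the generating side and the selection unit 117 swap roles exactly once per cycle is made for illustration.

    /* Sketch of the double buffering in the third embodiment: while
     * the generating side writes into one set of areas, the selection
     * unit 117 reads from the other, and the two sides swap roles
     * every cycle. */
    #include <stdio.h>

    typedef struct { int cg, control, video; } BufferSet;

    int main(void)
    {
        BufferSet set[2] = { {0, 0, 0}, {0, 0, 0} }; /* A = set[0], B = set[1] */
        int writing = 0;                             /* index being written   */

        for (int cycle = 1; cycle <= 4; cycle++) {
            int reading = 1 - writing;
            /* generating side fills one set... */
            set[writing].cg = set[writing].control = set[writing].video = cycle;
            /* ...while the selection unit combines from the other */
            printf("cycle %d: write set %c, read set %c (cg=%d)\n",
                   cycle, "AB"[writing], "AB"[reading], set[reading].cg);
            writing = reading;                       /* swap roles */
        }
        return 0;
    }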

(2) Relationship Between CG Images, Video Frames and Control Data

FIG. 21 shows the relationship between CG images, video frames and control data in the moving image combining apparatus 30.

In the drawing, a bit 473 a in control data 473 is set at 0. The bit 473 a corresponds to a part 471 a of a CG image 471 that is not a video display surface. The part 471 a is written in the frame buffer 108.

A bit 473 b in the control data 473 is set at 1. The bit 473 b corresponds to a part 472 a in a video frame 472. The part 472 a is written in the frame buffer 108.

(3) Operation of Moving Image Combining Apparatus 30

FIG. 22 is a flowchart showing the operation of the moving image combining apparatus 30.

The coordinate light source calculating unit 103 reads outline coordinates 212, location coordinates 213 and video display surface coordinates 214 for each object from the object table 201 in the data storage unit 102 (step S101), and receives information showing a forward, back, left, right, up, down or operation end instruction from the input unit 101 (step S102). If information for an operation end instruction is received, the coordinate light source calculating unit 103 ends processing (step S103). If information for another type of instruction is received (step S103) and graphics calculation can take place at this time (step S301), the coordinate light source calculating unit 103 calculates viewpoint coordinates E according to the received information, 3D coordinates for points forming an object in the 3D coordinate space A, 2D coordinates for points formed on the plane H, and depth values showing the distance of each point from the plane H in the depth direction, and clips the object (step S104). The rendering unit 104 performs rendering such as deletion of hidden lines and surfaces, display of surface shading, display of surface color, and texture mapping using the 2D coordinates and depth values, forming a CG image as a bitmap image (step S105). The rendering unit 104 then outputs the CG image to the graphics area A116 a or the graphics area B116 b (step S106). Next, the routine returns to step S102 and the above processing is repeated.

Following step S104, the coordinate light source calculating unit 103 also calculates 3D coordinates for points forming a video display surface in the 3D coordinate space A, and the perspective transform unit 105 calculates 2D coordinates on the plane H for points forming the video display surface (step S111). Control then moves to step S123. The perspective transform unit 105 also generates control data (step S304). The routine then returns to step S102 and the above processing is repeated.

If graphics calculation cannot take place at this time (step S301), the routine returns once more to step S102, and processing is repeated.

Meanwhile, the image decoder 106 reads the MPEG stream 221 stored in the data storage unit 102 (step S121), and repeatedly generates video frames by decoding data from the read MPEG stream 221 (step S122). The image transform unit 107 receives a video frame from the image decoder 106 and receives the 2D coordinates for points forming the video display surface calculated in step S111 from the perspective transform unit 105. The image transform unit 107 then generates a transformed video frame by using an affine transform to change the received video frame to the outline represented by the received 2D coordinates (step S123). The image transform unit 107 outputs the transformed video frame to the video frame area A115 a or the video frame area B115 b (step S124). Next, the routine returns to step S102 and the above processing is repeated.

In addition, the selection unit 117 reads a CG image from either the graphics area A116 a or the graphics area B116 b, a video frame from either the video frame area A115 a or the video frame area B115 b, and control data from either the control data area A114 a or the control data area B114 b. The selection unit 117 then combines the CG image and the video frame using the read control data, and writes the combined still image in the frame buffer 108 (step S305). The display unit 109 reads the still image from the frame buffer 108 and displays it (step S306). The routine then returns to step S102 and the above processing is repeated.

FIG. 23 is a flowchart showing the operation for image combining performed by the moving image combining apparatus 30.

The selection unit 117 repeats steps S312 to S314, explained below, for each pixel of the still image written in the frame buffer 108.

When a bit C (x, y) in the control data is 1 (step S312), the selection unit 117 sets a pixel F (x, y) in the still image stored in the frame buffer 108 as a pixel V (x, y) from the video frame (step S313). When the bit C (x, y) in the control data is 0 (step S312), the selection unit 117 sets the pixel F (x, y) in the still image stored in the frame buffer 108 as a pixel G (x, y) from the CG image (step S314). Here, (x, y) are coordinates showing a location in the still image.

(4) Timing of Processing Performed by Various Components of the Moving Image Combining Apparatus 30

FIG. 24 is a timechart showing timing for processing performed by various components of the moving image combining apparatus 30. The horizontal axis shows time, and the vertical axis shows processing performed by the various components of the moving image combining apparatus 30.

When a CG image and a video frame are newly generated, and the video frame is pasted onto the CG image, image decoding C301, coordinate light source calculation C302, coordinate light source calculation C305 and combining C307 are simultaneously started. Here, coordinate light source calculation C302 is performed by the coordinate light source calculating unit 103 to calculate 3D coordinates for points in the 3D coordinate space A showing the video display surface, and coordinate light source calculation C305 is performed by the coordinate light source calculating unit 103 to calculate 2D coordinates on the plane H for points forming an object and depth values showing the distance of each point from the plane H in the depth direction. Once coordinate light source calculation C302 is completed, perspective transform C303 is performed, and once this is completed, image transform C304 is performed. Furthermore, once coordinate light source calculation C305 is completed, rendering C306 is performed. Once combining C307 is completed, display C308 takes place.

When a new video frame is generated and pasted onto a previously-generated CG image, image decoding C311, coordinate light source calculation C312 and combining C317 are simultaneously started. Here, coordinate light source calculation C312 is performed by the coordinate light source calculating unit 103 to calculate 3D coordinates in the 3D coordinate space A for points forming the video display surface. Once coordinate light source calculation C312 is completed, perspective transform C313 is performed, and once this is completed, image transform C314 is performed. Once combining C317 is completed, display C318 takes place.

3.3 Summary

As explained above, generation of a CG image, decode/transform processing for a video frame, and combining of the generated CG image and the video frame are performed in parallel using separate processes. This means that computer graphics and a video image can be combined at their respective display rates, and generation of a CG image, decode/transform processing for a video frame, and combining of the generated CG image and the video frame can be performed more quickly.

4 Fourth Embodiment

The following is an explanation of a digital broadcast receiving apparatus 40, a further alternative to the first embodiment of the present invention.

4.1 Construction of Digital Broadcast Receiving Apparatus 40

The digital broadcast receiving apparatus 40, like the digital broadcast receiving apparatus 20, is constructed from a main unit 26, a monitor 21, a remote controller 22, an antenna 23 and the like. The digital broadcast receiving apparatus 40 reads object information concerning 3D objects recorded on a CD-ROM, and receives a plurality of broadcast video sequences. The digital broadcast receiving apparatus 40 generates a CG image formed from a plurality of objects, each with a video display surface, pastes a video frame onto each of the video display surfaces in the generated CG image and displays the combined image on the monitor 21.

In the present embodiment, the digital broadcast receiving apparatus 40 receives first, second and third video sequences, and first, second and third objects respectively have first, second and third video display surfaces.

FIG. 25 is a block diagram of the digital broadcast receiving apparatus 40. In the drawing, the digital broadcast receiving apparatus 40 is constructed from an input unit 101, a data storage unit 102, a coordinate light source calculating unit 103, a rendering unit 104, a perspective transform unit 105, image decoders 106 a, 106 b and 106 c, image transform units 107 a, 107 b and 107 c, a frame buffer 108, a display unit 109, a tuner 110, a priority ranking control unit 111, a control data storage unit 114, a video frame storage unit 115, a graphics storage unit 116, a selection unit 117 and an antenna 23.

The components of the digital broadcast receiving apparatus 40 having the same numerical references as components of the digital broadcast receiving apparatus 20 have the same construction. Furthermore, the control data storage unit 114, the video frame storage unit 115, the graphics storage unit 116, and the selection unit 117 are the same as those in the moving image combining apparatus 30. In other words, the digital broadcast receiving apparatus 40 is a combination of the digital broadcast receiving apparatus 20 and the moving image combining apparatus 30.

The following explanation concentrates on the differences from the components of the digital broadcast receiving apparatus 20.

(1) Rendering Unit 104

Rather than outputting generated CG images to the frame buffer 108, the rendering unit 104 outputs them alternately to the graphics areas A116 a and B116 b.

(2) Perspective Transform Unit 105

The perspective transform unit 105 further generates control data, and outputs the generated control data alternately to the control data areas A114 a and B114 b, in the same way as the perspective transform unit 105 in the moving image combining apparatus 30.

Here, the control data is, for example, a 640×480 data array, totaling 307,200 items. Each item of control data is formed from two bits and so may be 0, 1, 2 or 3. Each item corresponds to a pixel in the CG image generated by the rendering unit 104.

The perspective transform unit 105 sets values in the control data so that items corresponding to the first video display surface are set at a value of 1, items corresponding to the second video display surface at a value of 2 and items corresponding to the third video display surface at a value of 3. All other items are set at a value of 0. Parts of the control data where a plurality of video display surfaces overlap are given the value of the uppermost video display surface.

(3) Image Transform Units 107 a, 107 b, 107 c

Rather than outputting transformed video frames to the masking unit 113 a, the image transform unit 107 a outputs them alternately to the video frame areas A115 a and B115 b.

The image transform units 107 b and 107 c are identical to the image transform unit 107 a.

(4) Selection Unit 117

The selection unit 117 performs the following processing for each item of the read control data. If the item is 0, the selection unit 117 reads a pixel at a location corresponding to the item from the pixels making up the read CG image, and writes the read pixel at a corresponding location in the frame buffer 108. If the item is 1, the selection unit 117 reads a pixel at a location corresponding to the item from the pixels making up the first video frame, and writes the read pixel at a corresponding location in the frame buffer 108. If the item is 2, the selection unit 117 reads a pixel at a location corresponding to the item from the pixels making up the second video frame, and writes the read pixel at a corresponding location in the frame buffer 108. If the item is 3, the selection unit 117 reads a pixel at a location corresponding to the item from the pixels making up the third video frame, and writes the read pixel at a corresponding location in the frame buffer 108.
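
A C sketch of this four-way selection follows, with the images flattened to short constant buffers for illustration.

    /* Sketch of the selection in the fourth embodiment: each control
     * item is 0 (CG image) or 1, 2, 3 (first, second or third video
     * frame). */
    #include <stdio.h>

    #define N 8   /* pixels, flattened, instead of 640 x 480 */

    int main(void)
    {
        unsigned char cg[N], v1[N], v2[N], v3[N], out[N];
        unsigned char control[N] = { 0, 1, 1, 2, 3, 0, 2, 3 };

        for (int i = 0; i < N; i++) {
            cg[i] = 10; v1[i] = 1; v2[i] = 2; v3[i] = 3;
        }
        for (int i = 0; i < N; i++) {
            switch (control[i]) {
            case 0: out[i] = cg[i]; break;  /* no video display surface */
            case 1: out[i] = v1[i]; break;  /* first video frame  */
            case 2: out[i] = v2[i]; break;  /* second video frame */
            case 3: out[i] = v3[i]; break;  /* third video frame  */
            }
        }
        for (int i = 0; i < N; i++)
            printf("%d ", out[i]);
        printf("\n");
        return 0;
    }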

4.2 Operation of Digital Broadcast Receiving Apparatus 40

(1) Data in Each Process Performed by the Digital Broadcast Receiving Apparatus 40

The following is an explanation of the operation of the digital broadcast receiving apparatus 40 with reference to FIG. 26. The drawing shows data in each process performed by the digital broadcast receiving apparatus 40.

As shown in the drawing, the coordinate light source calculating unit 103 calculates 3D coordinates 704 for points forming objects in the 3D space A using object information 701. The rendering unit 104 performs rendering, forms a CG image 702 as a bitmap image, and outputs the CG image 702 to the graphics area B116 b. The perspective transform unit 105 calculates 2D coordinates for points on the plane H forming video display surfaces, generates control data 705, and outputs the control data 705 to the control data area B114 b. The image transform units 107 a to 107 c respectively generate transformed video frames 708, 712 and 716 from video frames 707, 711 and 715, and output the transformed video frames 708, 712 and 716 to the video frame area B115 b.

Meanwhile, the selection unit 117 reads a CG image 703 from the graphics area A116 a, video frames 710, 714 and 718 from the video frame area A115 a, and control data 706 from the control data area A114 a. The selection unit 117 generates a still image 719 using the read control data 706, thereby combining the CG image 703 with the video frames 710, 714 and 718, and writes the still image 719 in the frame buffer 108.

(2) Timing of Various Processing Performed by the Digital Broadcast Receiving Apparatus 40

FIG. 27 is a timechart showing the timing of processing performed by various components of the digital broadcast receiving apparatus 40. The horizontal axis shows time and the vertical axis shows processing performed by the various components of the digital broadcast receiving apparatus 40.

When a CG image and video frames are newly generated, and the video frames are pasted onto the CG image, coordinate light source calculation C404, coordinate light source calculation C410 and combining C412 are simultaneously started. Here, coordinate light source calculation C404 is performed by the coordinate light source calculating unit 103 to calculate 3D coordinates for points in the 3D coordinate space A showing video display surfaces, and coordinate light source calculation C410 is performed by the coordinate light source calculating unit 103 to calculate 2D coordinates on the plane H for points forming objects, and depth values showing the distance of each point from the plane H in the depth direction. Once the coordinate light source calculation C404 is completed, perspective transform C405 is performed, and once this is completed, priority ranking control C406 is started. Once priority ranking control C406 is completed, the image decoders 106 a to 106 c start image decoding C401, C402 and C403. Once image decoding C401, C402 and C403 is completed, image transforms C407, C408 and C409 are started. Meanwhile, once coordinate light source calculation C410 is completed, rendering C411 is performed. Once combining C412 is completed, display C413 takes place.

When new video frames are generated and pasted onto a previously-generated CG image, coordinate light source calculation C424 and combining C432 are simultaneously started. Here, coordinate light source calculation C424 is performed by the coordinate light source calculating unit 103 to calculate 3D coordinates in the 3D coordinate space A for points forming the video display surfaces. Once coordinate light source calculation C424 is completed, perspective transform C425 is performed, and once this is completed, priority ranking control C426 is started. Next, once priority ranking control C426 is completed, the image decoders 106 a to 106 c start image decoding C421, C422 and C423. Once image decoding C421, C422 and C423 is completed, image transforms C427, C428 and C429 are started, and once combining C432 is completed, display C433 takes place.

4.3 Summary

As explained above, generation of a CG image, decode/transform processing for a plurality of video frames, and combining of the generated CG image and the plurality of video frames are performed in parallel using separate processes. This means that computer graphics and images from a plurality of video sequences can be combined at their respective display rates, and generation of a CG image, decode/transform processing for video frames, and combining of the generated CG image and video frames can be performed more quickly. Since a video display surface nearest the front of the 3D coordinate space is given a higher priority ranking, and the corresponding decoder is made to decode the video sequence at a higher frame rate, video images can be displayed with higher quality on video display surfaces with a higher priority ranking.

5 Fifth Embodiment

The following is an explanation of a digital broadcast receiving apparatus 50, a further alternative to the first embodiment of the present invention.

5.1 Construction of Digital Broadcast Receiving Apparatus 50

The digital broadcast receiving apparatus 50 includes a main unit 26, a monitor 21, a remote controller 22 and an antenna 23, in the same way as the digital broadcast receiving apparatus 20. The digital broadcast receiving apparatus 50 reads object information for 3D objects recorded on a CD-ROM, receives a plurality of video sequences, generates a CG image formed from a plurality of objects, each having a video display surface, pastes the video frames onto each video display surface in the CG image and displays the combined image on the monitor 21.

FIG. 28 is a block diagram of the digital broadcast receiving apparatus 50. In the drawing, the digital broadcast receiving apparatus 50 is constructed from an input unit 101, a data storage unit 102, a coordinate light source calculating unit 103, a rendering unit 104, a perspective transform unit 105, image decoders 106 a, 106 b and 106 c, image transform units 107 a, 107 b and 107 c, a frame buffer 108, a display unit 109, a tuner 110, a priority ranking control unit 111, a masking control unit 112, masking units 113 a, 113 b and 113 c, image adjustment units 118 a, 118 b and 118 c and an antenna 23.

The components of the digital broadcast receiving apparatus 50 having the same numerical references as components of the digital broadcast receiving apparatus 20 in the second embodiment have the same construction. The following explanation concentrates on the differences between the two embodiments.

(1) Image Decoders 106 a, 106 b and 106 c

Rather than outputting generated video frames to the image transform unit 107 a, the image decoder 106 a outputs them to the image adjustment unit 118 a. In other respects, the image decoder 106 a is identical to the image decoder 106 a in the digital broadcast receiving apparatus 20.

The image decoders 106 b and 106 c are identical to the image decoder 106 a.

(2) Priority Ranking Control Unit 111

The priority ranking control unit 111 outputs a determined priority ranking to the image adjustment unit 118 a to 118 c corresponding to the video display surface of each object.

(3) Image Adjustment Units 118 a, 118 b, 118 c

The image adjustment unit 118 a receives a video frame from the image decoder 106 a and a determined priority ranking from the priority ranking control unit 111.

When the received priority ranking is the highest possible, the image adjustment unit 118 a outputs the received video frame to the image transform unit 107 a without alteration.

When the received priority ranking is medium or low, the image adjustment unit 118 a adjusts the luminance of the video image so that video images with a lower priority ranking have lower luminance. Basically, this means that the value of each pixel in the video frame is divided by an appropriate value. Here, examples of this value are 4 for a medium priority ranking and 8 for a low priority ranking. Alternatively, when the priority ranking is medium, the pixel value may be shifted one bit downwards, and when it is low, the pixel value may be shifted two bits downwards.
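
A C sketch of this adjustment follows, using the divisor variant (4 for medium, 8 for low). Note that dividing by 4 or 8 corresponds to two- or three-bit right shifts, whereas the alternative stated in the text uses one- and two-bit shifts, i.e. smaller reductions; the two variants are given in the embodiment as alternatives, not as equivalents.

    /* Sketch of the luminance adjustment in the image adjustment unit
     * 118a: pixel values of medium-priority frames are divided by 4
     * and those of low-priority frames by 8; highest-priority frames
     * pass through unchanged. */
    #include <stdio.h>

    typedef enum { PRIO_HIGH, PRIO_MEDIUM, PRIO_LOW } Priority;

    static void adjust_luminance(unsigned char *pix, int n, Priority p)
    {
        int div = (p == PRIO_MEDIUM) ? 4 : (p == PRIO_LOW) ? 8 : 1;
        for (int i = 0; i < n; i++)
            pix[i] = (unsigned char)(pix[i] / div);
    }

    int main(void)
    {
        unsigned char frame[4] = { 200, 128, 64, 32 };
        adjust_luminance(frame, 4, PRIO_LOW);
        printf("%d %d %d %d\n", frame[0], frame[1], frame[2], frame[3]);
        return 0;
    }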

5.2 Summary

As explained above, generation of a CG image and decode/transform processing for a plurality of video frames are performed in parallel using separate processes, and the generated CG image and the plurality of video frames are combined in the frame buffer. This means that computer graphics and images from a plurality of video sequences can be combined at their respective display rates. Since a video display surface nearest the front of the 3D coordinate space is given a higher priority ranking, and the corresponding image decoder is made to decode the video sequence at a higher frame rate, video images can be displayed with higher quality on video display surfaces with a higher priority ranking. In addition, the luminance of video display surfaces with a low priority ranking is adjusted to a low level, so that flicker is less noticeable on video display surfaces with a low priority ranking, which are likely to have a lower display rate.

6 Further Alternatives

(1) The combining of computer graphics and video images explained in the above embodiments may be applied in TV game machines, image reproduction apparatuses such as DVD, video CD and CD players, or in information processing terminals.

(2) In the above embodiments, a video sequence is pasted onto a video display surface of an object, but computer graphics and a video sequence may instead be displayed side by side on a television screen.

(3) In the above embodiments, the display rate for video images is thirty frames per second, and the display rate for computer graphics is ten frames per second, but different display rates may of course be used. For example, the display rate for video images may be set at ten frames per second and the display rate for computer graphics at thirty frames per second.

(4) In the above embodiments, an object has one video display surface, but an object may have a plurality of video display surfaces.

(5) The invention may alternatively be embodied in a moving image combining method that uses the procedures described in the above embodiments. This moving image combining method may be a moving image combining program executed by a computer, or a computer-readable recording medium recording the moving image combining program. The computer-readable recording medium may be a floppy disk, CD-ROM, DVD-ROM, DVD-RAM, semiconductor memory or similar. The moving image combining program may be transmitted via a communication path in the form of a digital signal or similar.

(6) The present invention may be formed from any combination of the above plurality of embodiments and alternatives.

INDUSTRIAL APPLICABILITY

The present invention may be used as a user interface for selecting programs in a digital broadcast receiving apparatus receiving digital broadcast waves broadcast on a plurality of channels. It may also be used as an image processing means for producing more advanced images in a TV game machine, an image reproduction apparatus such as a DVD, video CD or CD player, a personal computer or an information processing terminal.

What is claimed is:
1. A moving image combining apparatus combining computer graphic images (hereafter referred to as CG images) and at least one video sequence composed of a plurality of video frames, the moving image combining apparatus comprising: an information storage means for storing object information showing an outline and location for at least one object in three-dimensional (3D) space; a video obtaining means for obtaining from an external source at least one video sequence composed of a plurality of video frames generated at a fixed video display rate; an image storage means; a receiving means for receiving position information showing a position of a moving viewpoint; a graphics generating means for generating CG images one at a time at a graphics display rate and, on completing the generation of a CG image, writing the CG image into the image storage means, the CG image obtained by projecting each object whose outline and location is shown by the object information onto a projection surface, as seen from a current position of the moving viewpoint shown by the position information; and a video frame generating means for fetching at least one video frame from the at least one video sequence at the video display rate and writing the fetched at least one video frame over a CG image, the CG image being stored in the image storage means immediately prior to the time that the at least one video frame was fetched, wherein the graphics generating means further performs rendering on each generated CG image, and writes the rendered CG images into the image storage means, each object includes at least one video display area, the moving image combining apparatus combines, on at least one video screen located on the projection surface, at least one video sequence and a CG image, each video screen corresponding to a video display area, the object information includes information showing an outline and location for each video display area, the graphics generating means further calculates screen information showing an outline and location for each video screen, each video screen obtained by projecting a video display area shown by an outline and location in the object information onto the projection surface, and the video frame generating means overwrites fetched video frames at each location shown by the screen information, so that each fetched video frame fits an outline shown in the screen information.
2. The moving image combining apparatus of claim 1, wherein: the video frame generating means generates transformed video frames by transforming the fetched video frames to fit an outline shown in the screen information; and overwrites the transformed video frames into the image storage means.
3. The moving image combining apparatus of claim 2, wherein: each object has a plurality of video display areas; the video obtaining means obtains a plurality of video sequences from an external source; the moving image combining apparatus combines, on each of a plurality of video screens on a projection surface, one of the video sequences with a CG image, each video screen corresponding to one of the plurality of video display areas; the object information includes information showing outlines and locations for a plurality of video display areas; the graphics generating means calculates screen information for each piece of information showing the outline and location for one of the plurality of video display areas; and the video frame generating means fetches video frames from each of the plurality of video sequences, and overwrites fetched video frames from the different video sequences at the different locations shown by the plurality of pieces of screen information, so that the fetched video frames fit the outlines shown in the screen information.
4. The moving image combining apparatus of claim 3, wherein the video frame generating means includes: a priority ranking determining means for determining a priority ranking for each video screen based on the plurality of pieces of calculated screen information; a video decoding means for obtaining video frames from each of the plurality of video sequences, based on the determined priority ranking; a masking location calculating means for calculating locations to be masked on each video screen, based on the plurality of pieces of calculated screen information and the priority ranking determined for each video screen; and a masking means for masking the transformed video frames at the calculated locations, wherein the video frame generating means overwrites the transformed video frames which have been masked into the image storage means.
5. The moving image combining apparatus of claim 4, wherein the priority ranking determining means determines priority rankings using the plurality of pieces of calculated screen information, with video screens nearer to the viewpoint having a higher priority ranking.
6. The moving image combining apparatus of claim 4, wherein the priority ranking determining means determines priority rankings using the plurality of pieces of calculated screen information, with video screens calculated as having a larger surface area having a higher priority ranking.
7. The moving image combining apparatus of claim 4, wherein the video decoding means obtains all of the video frames from a video sequence with the highest priority ranking, and omits more video frames from video sequences with lower priority rankings.
8. The moving image combining apparatus of claim 4, wherein the video decoding means includes an image quality adjustment unit reducing luminance of obtained video frames, and does not reduce the luminance of video frames from the video sequence with the highest priority ranking, while reducing the luminance of video frames from video sequences with lower priority rankings.
9. A moving image combining apparatus combining three-dimensional CG images and at least one video sequence composed of a plurality of video frames, the moving image combining apparatus comprising: an information storage means for storing object information showing an outline and location for each object, and an outline and location for at least one video display area for each object; a video obtaining means for obtaining from an external source at least one video sequence composed of a plurality of video frames generated at a fixed video display rate; a CG image storage means; a video frame storage means; an image storage means; a receiving means for receiving position information showing a position of a moving viewpoint; a graphics generating means for (1) generating CG images one at a time at a graphics display rate and, on completing the generation of a CG image, writing the CG image into the CG image storage means, the CG image obtained by projecting each object whose outline and location is shown by the object information onto a projection surface, as seen from a current position of the moving viewpoint shown by the position information, and (2) calculating screen information showing an outline and location for at least one video screen obtained by projecting each video display area shown by an outline and location in the object information onto the projection surface; a video frame generating means for fetching at least one video frame from the at least one video sequence at the video display rate and overwriting the fetched at least one video frame into the video frame storage means; and a selecting means for (1) selecting elements forming still images from the at least one video frame written in the video frame storage means and a CG image written in the CG image storage means, the CG image being written in the CG image storage means immediately prior to the time that the at least one video frame was fetched, and (2) writing the selected elements in the image storage means, wherein the CG image storage means includes a first graphics storage unit and a second graphics storage unit; the video frame storage means includes a first video storage unit and a second video storage unit; the graphics generating means writes obtained CG images alternately in the first and second graphics storage units; the video frame generating means writes obtained video frames alternately in the first and second video storage units; and the selecting means: (1) (a) reads a CG image from the second graphics storage unit while the graphics generating means is writing a CG image into the first graphics storage unit, and (b) reads a CG image from the first graphics storage unit while the graphics generating means is writing a CG image into the second graphics storage unit, (2) (a) reads a video frame from the second video storage unit while the video frame generating means is writing a video frame into the first video storage unit, and (b) reads a video frame from the first video storage unit while the video frame generating means is writing a video frame into the second video storage unit, and (3) selects elements forming still images from the read CG images and video frames, wherein the graphics generating means further performs rendering on each generated CG image, and writes the rendered CG images into the image storage means, and the video frame generating means generates transformed video frames by transforming the fetched video frames to fit an outline shown in the screen information, and overwrites the transformed video frames into the image storage means.
10. The moving image combining apparatus of claim 1, wherein: each object has a plurality of video display areas; the video obtaining means obtains a plurality of video sequences from an external source; the moving image combining apparatus combines, on each of a plurality of video screens on a projection surface, one of the video sequences with a CG image, each video screen corresponding to one of the plurality of video display areas; the object information includes information showing outlines and locations for a plurality of video display areas; the graphics generating means calculates screen information for each piece of information showing the outline and location for one of the plurality of video display areas; and the video frame generating means fetches video frames from each of the plurality of video sequences, and overwrites fetched video frames from the different video sequences at the different locations shown by the plurality of pieces of screen information, so that the fetched video frames fit the outlines shown in the screen information.
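As a sketch of how a fetched video frame could be transformed to fit the outline that a piece of screen information gives for its video screen, assuming OpenCV and NumPy are available; the function name, the corner ordering, and the in-place overwrite of the stored image are assumptions of the sketch.

    import numpy as np
    import cv2

    def paste_frame(frame, screen_corners, stored_image):
        # Warp the decoded frame onto the quadrilateral outline of one
        # video screen, then overwrite only the pixels inside that outline.
        h, w = frame.shape[:2]
        src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
        dst = np.float32(screen_corners)    # projected outline, four corners
        m = cv2.getPerspectiveTransform(src, dst)
        size = (stored_image.shape[1], stored_image.shape[0])
        warped = cv2.warpPerspective(frame, m, size)
        mask = cv2.warpPerspective(np.full((h, w), 255, np.uint8), m, size)
        stored_image[mask > 0] = warped[mask > 0]
        return stored_image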
11. The moving image combining apparatus of claim 10, wherein the video frame generating means includes: a priority ranking determining means for determining a priority ranking for each video screen based on the plurality of pieces of calculated screen information; a video decoding means for obtaining video frames from each of the plurality of video sequences, based on the determined priority ranking; a masking location calculating means for calculating locations to be masked on each video screen, based on the plurality of pieces of calculated screen information and the priority ranking determined for each video screen; and a masking means for masking the transformed video frames at the calculated locations, wherein the video frame generating means overwrites the transformed video frames which have been masked into the image storage means.
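The masking location calculation of claim 11 can be sketched as follows, assuming each video screen's projected footprint is available as a boolean pixel map and that a larger priority value means a higher ranking; the names below are hypothetical.

    import numpy as np

    def visible_regions(coverage, priority):
        # coverage: one boolean map per video screen; priority: higher wins.
        covered = np.zeros_like(coverage[0])
        visible = [None] * len(coverage)
        # Walk the screens from highest to lowest priority ranking.
        for i in sorted(range(len(coverage)),
                        key=lambda i: priority[i], reverse=True):
            visible[i] = coverage[i] & ~covered  # mask out higher-ranked screens
            covered |= coverage[i]
        return visible

A transformed frame for screen i would then be overwritten into the image storage means only where visible[i] is true.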
12. The moving image combining apparatus of claim 11, wherein the priority ranking determining means determines priority rankings using the plurality of pieces of calculated screen information, with video screens nearer to the viewpoint having a higher priority ranking.
13. The moving image combining apparatus of claim 11, wherein the priority ranking determining means determines priority rankings using the plurality of pieces of calculated screen information, with video screens calculated as having a larger surface area having a higher priority ranking.
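Claims 12 and 13 describe two alternative ranking rules; a brief sketch of both follows, assuming hypothetical screen records that carry a distance from the viewpoint and a projected surface area computed from the screen information.

    def rank_by_distance(screens):
        # Claim 12: video screens nearer to the viewpoint rank higher.
        return sorted(screens, key=lambda s: s["distance"])

    def rank_by_area(screens):
        # Claim 13: video screens with a larger projected surface area rank higher.
        return sorted(screens, key=lambda s: s["area"], reverse=True)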
14. The moving image combining apparatus of claim 11, wherein the video decoding means obtains all of the video frames from a video sequence with the highest priority ranking, and omits more video frames from video sequences with lower priority rankings.
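One plausible reading of claim 14 is frame decimation that grows with distance from the top ranking; the skip factor below is an illustrative assumption, not a value taken from the claims.

    def frames_to_decode(frame_indices, rank):
        # rank 0 is the highest priority ranking and keeps every frame;
        # each lower ranking keeps only every (rank + 1)-th frame.
        step = 1 if rank == 0 else rank + 1
        return frame_indices[::step]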
15. The moving image combining apparatus of claim 11, wherein the video decoding means includes an image quality adjustment unit reducing luminance of obtained video frames, and does not reduce the luminance of video frames from the video sequence with the highest priority ranking, while reducing the luminance of video frames from video sequences with lower priority rankings.
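A sketch of the image quality adjustment unit of claims 8 and 15, assuming the luminance (Y) plane of a decoded frame is held as a NumPy array; the 0.7 attenuation factor is an illustrative assumption.

    def adjust_luminance(y_plane, rank):
        # rank 0 is the highest priority ranking; its luminance is untouched.
        if rank == 0:
            return y_plane
        # Lower-ranked sequences are dimmed by a fixed, assumed factor.
        return (y_plane * 0.7).astype(y_plane.dtype)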
16. A moving image combining method for combining CG images and at least one video sequence composed of a plurality of video frames, the moving image combining method used by a moving image combining apparatus having an information storage means and an image storage means, the information storage means storing object information showing an outline and location for at least one object in three-dimensional space, and the moving image combining method comprising: a video obtaining step obtaining from an external source at least one video sequence composed of a plurality of video frames generated at a fixed video display rate; a receiving step receiving position information showing a position of a moving viewpoint; a graphics generating step generating CG images one at a time at a graphics display rate and, on completing the generation of a CG image, writing the CG image into the image storage means, the CG image obtained by projecting each object whose outline and location is shown by the object information onto a projection surface, as seen from a current position of the moving viewpoint shown by the position information; and a video frame generating step fetching at least one video frame from the at least one video sequence at the video display rate and writing the fetched at least one video frame over a CG image, the CG image being stored in the image storage means immediately prior to the time that the at least one video frame was fetched, wherein: the graphics generating step further performs rendering on each generated CG image, and writes the rendered CG images into the image storage means; each object includes at least one video display area; the moving image combining apparatus combines, on at least one video screen located on the projection surface, at least one video sequence and a CG image, each video screen corresponding to a video display area; the object information includes information showing an outline and location for each video display area; the graphics generating step further calculates screen information showing an outline and location for each video screen, each video screen obtained by projecting a video display area shown by an outline and location in the object information onto the projection surface; and the video frame generating step overwrites fetched video frames at each location shown by the screen information, so that each fetched video frame fits an outline shown in the screen information.
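The method of claim 16 runs two activities at independent rates: video frames are fetched on the fixed video display rate while CG generation is free-running. A sketch with two loops follows; stop is assumed to be a threading.Event, and image_store, project, render, and fit_to_outline are hypothetical helpers assumed for the sketch.

    import time

    VIDEO_PERIOD = 1 / 30   # one fixed video display rate, e.g. 30 fps

    def video_loop(sequence, screen_info, image_store, stop):
        next_tick = time.monotonic()
        for frame in sequence:
            if stop.is_set():
                break
            # Overwrite the frame onto whichever CG image is stored now.
            image_store.paste(fit_to_outline(frame, screen_info))
            next_tick += VIDEO_PERIOD
            time.sleep(max(0.0, next_tick - time.monotonic()))

    def graphics_loop(objects, viewpoint, image_store, stop):
        while not stop.is_set():
            # Generation time varies with scene complexity, so this loop
            # simply writes each CG image as soon as it is completed.
            image_store.replace(render(project(objects, viewpoint.current())))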
17. A moving image combining method for combining, on a video display area, CG images and at least one video sequence composed of a plurality of video frames, the moving image combining method used by a moving image combining apparatus having an information storage means, a CG image storage means having a first graphics storage unit and a second graphics storage unit, a video frame storage means having a first video storage unit and a second video storage unit, and an image storage means, the information storage means storing object information showing an outline and location for at least one object, and an outline and location for a video screen for each object, in three-dimensional space, and the moving image combining method comprising: a video obtaining step obtaining from an external source at least one video sequence composed of a plurality of video frames generated at a fixed video display rate; a receiving step receiving position information showing a position of a moving viewpoint; a graphics generating step (1) generating CG images one at a time at a graphics display rate, performing rendering on each generated CG image, and, on completing the rendering of a CG image, writing the rendered CG image alternately into the first and second graphics storage units, the CG images obtained by projecting each object whose outline and location is shown by the object information onto a projection surface, as seen from a current position of the moving viewpoint shown by the position information, and (2) calculating screen information showing an outline and location for at least one video screen, the video screen obtained by projecting the at least one video display area shown by an outline and location in the object information onto the projection surface; a video frame generating step fetching video frames from the video sequence at the video display rate and overwriting the fetched video frames alternately into the first and second video storage units; reading a CG image from the second graphics storage unit while the graphics generating step is writing a CG image into the first graphics storage unit, and reading a CG image from the first graphics storage unit while the graphics generating step is writing a CG image into the second graphics storage unit; reading a video frame from the second video storage unit while the video frame generating step is writing a video frame into the first video storage unit, and reading a video frame from the first video storage unit while the video frame generating step is writing a video frame into the second video storage unit; selecting elements forming still images from the read CG images and video frames, the CG image being written in the CG image storage means immediately prior to the time that a video frame was fetched; and writing the selected elements into the image storage means.
18. A recording medium recording a moving image combining program combining CG images and at least one video sequence composed of a plurality of video frames, the moving image combining program used by a computer having an information storage means and an image storage means, the information storage means storing object information showing an outline and location for at least one object in three-dimensional space, and the moving image combining program comprising: a video obtaining step obtaining from an external source at least one video sequence composed of a plurality of video frames generated at a fixed video display rate; a receiving step receiving position information showing a position of a moving viewpoint; a graphics generating step generating CG images one at a time at a graphics display rate and, on completing the generation of a CG image, writing the CG image into the image storage means, the CG image obtained by projecting each object whose outline and location is shown by the object information onto a projection surface, as seen from a current position of the moving viewpoint shown by the position information; and a video frame generating step fetching at least one video frame from the at least one video sequence at the video display rate and writing the fetched at least one video frame over a CG image, the CG image being stored in the image storage means immediately prior to the time that the at least one video frame was fetched, wherein: the graphics generating step further performs rendering on each generated CG image, and writes the rendered CG images into the image storage means; each object includes at least one video display area; the computer combines, on at least one video screen located on the projection surface, at least one video sequence and a CG image, each video screen corresponding to a video display area; the object information includes information showing an outline and location for each video display area; the graphics generating step further calculates screen information showing an outline and location for each video screen, each video screen obtained by projecting a video display area shown by an outline and location in the object information onto the projection surface; and the video frame generating step overwrites fetched video frames at each location shown by the screen information, so that each fetched video frame fits an outline shown in the screen information.
19. A recording medium recording a moving image combining program combining, on a video display area, CG images and at least one video sequence composed of a plurality of video frames, the moving image combining program used by a computer having an information storage means, a CG image storage means having a first graphics storage unit and a second graphics storage unit, a video frame storage means having a first video storage unit and a second video storage unit, and an image storage means, the information storage means storing object information showing an outline and location for at least one object, and an outline and location for a video screen for each object, in three-dimensional space, and the moving image combining program comprising: a video obtaining step obtaining from an external source at least one video sequence composed of a plurality of video frames generated at a fixed video display rate; a receiving step receiving position information showing a position of a moving viewpoint; a graphics generating step (1) generating CG images one at a time at a graphics display rate, performing rendering on each generated CG image, and, on completing the rendering of a CG image, writing the rendered CG image alternately into the first and second graphics storage units, the CG images obtained by projecting each object whose outline and location is shown by the object information onto a projection surface, as seen from a current position of the moving viewpoint shown by the position information, and (2) calculating screen information showing an outline and location for at least one video screen, the video screen obtained by projecting the at least one video display area shown by an outline and location in the object information onto the projection surface; a video frame generating step fetching video frames from the video sequence at the video display rate and overwriting the fetched video frames alternately into the first and second video storage units; reading a CG image from the second graphics storage unit while the graphics generating step is writing a CG image into the first graphics storage unit, and reading a CG image from the first graphics storage unit while the graphics generating step is writing a CG image into the second graphics storage unit; reading a video frame from the second video storage unit while the video frame generating step is writing a video frame into the first video storage unit, and reading a video frame from the first video storage unit while the video frame generating step is writing a video frame into the second video storage unit; selecting elements forming still images from the read CG images and video frames, the CG image being written in the CG image storage means immediately prior to the time that a video frame was fetched; and writing the selected elements into the image storage means.